ABSTRACT


Without the protection of the atmosphere, space systems must mitigate radiation effects. Several technologies are used to deal with different radiation effects in order to keep a space device working properly. One radiation effect, the Single Event Upset (SEU), can change the state of a component or of data on a bus. A single error can cause a system failure if it is not corrected.

Besides error correction, a space system also needs the flexibility to be modified or upgraded easily. Consequently, the idea was developed of instantiating a Triple Modular Redundancy (TMR) design in a Field Programmable Gate Array (FPGA) to construct a Configurable Fault-Tolerant Processor (CFTP). TMR, which runs one program on three identical soft-core processors with voters, is a scheme used to mitigate an SEU. The full TMR design running in an FPGA functions as a System-On-a-Chip (SOC). Both the soft-core processor and the FPGA give the CFTP great flexibility to be reconfigured.

A complete TMR design includes some fundamental components besides processors and voters, such as the Reconciler, the Interrupt, and the Error Syndrome Storage Device (ESSD). Each of these components has a unique function in the TMR design. They are created and simulated in this thesis. Factors that affect test-bench settings, such as processor pipelining, must always be kept in mind. A component is first designed to implement its proper function and is then revised to work with the processor and memory. The full TMR design in this thesis demonstrates its ability to detect and correct an SEU. The suggested follow-on research is to improve the efficiency and performance of this design.
EXECUTIVE SUMMARY


Space systems suffer radiation effects in space. These radiation effects occur randomly and are hard to predict. The combination of effects can destroy a system or render it nonfunctional. Therefore, different methods, such as radiation-hardened or fault-tolerant systems, have been presented to protect space devices. Space systems are usually tested and simulated several times before launch in order to minimize the probability of losing control of them after launch.

The Single Event Upset (SEU) is a radiation effect that causes a bit flip in a device. This effect is not strong enough to destroy a system but may cause a series of errors that finally make the system unusable. Such an error should be corrected in time, and Triple Modular Redundancy (TMR) is one of the schemes to mitigate this problem.

The TMR design selected for the CFTP instantiates three soft-core processors, together with some other components, in a fault-tolerant Field Programmable Gate Array (FPGA). The FPGA is easily reconfigured, and the soft-core processor has great flexibility to be programmed or modified. These features give a TMR design the ability to be maintained and upgraded. The processor chosen for the TMR design is a 16-bit Reduced Instruction Set Computer (RISC) processor named KDLX. It is a 5-stage pipelined processor with a Harvard architecture. The pipeline affects the settings of a test bench, and this influence is discussed in this thesis. A full simulation of all instructions is presented to help in understanding the functions of the different operation codes.

To stop an error from propagating, the TMR has to correct the error once it is detected. The three processors in the TMR should always execute the same instruction, and all their actions should be identical. Any inconsistency found among the three processors is considered an error. The TMR then needs a function to stall the current operation and correct errors in the processors. For error detection and correction, the following four major components are designed: the majority bit voter, the Reconciler, the Interrupt, and the Error Syndrome Storage Device (ESSD).
Voters are connected at the output pins or buses of the processors; therefore, all output signals are voted. The majority bit voter takes the value on which at least two of the three signals agree as the output signal and reports the occurrence of an error if one of the three is different. The voter is able to correct an error immediately and indicate where the error is. The construction of three processors with voters is called the TMR Assembly.
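The voting rule described above can be sketched in a few lines of Python. This is only an illustrative software model, not the thesis's VHDL voter; the function name `majority_vote` and the 16-bit `WIDTH` constant are assumptions chosen to match the 16-bit KDLX buses.

```python
WIDTH = 16  # assumed bus width, chosen to match the 16-bit KDLX


def majority_vote(a, b, c):
    """Return (voted_word, error_flag) for three processor output words.

    Each output bit takes the value held by at least two of the three inputs.
    The error flag is raised whenever the three words are not all identical.
    """
    mask = (1 << WIDTH) - 1
    a, b, c = a & mask, b & mask, c & mask
    voted = (a & b) | (a & c) | (b & c)  # bitwise two-out-of-three majority
    error = not (a == b == c)
    return voted, error


good = 0x1234
upset = good ^ (1 << 5)  # model an SEU as a single flipped bit
print(majority_vote(good, upset, good))
```

With a single upset word, the voted value equals the two agreeing words and the error flag is set, which is the behavior the paragraph attributes to the majority bit voter.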

Because the processor and memory have different architectures, a Reconciler is responsible for coordinating the difference between them. The solution is to run the memory twice as fast as the processor and let the Reconciler route the memory data. The memory acts as an instruction memory during the first half of the processor clock cycle and as a data memory during the other half. Thus, the processor behaves as if it were connected to two different memories. The Reconciler in the TMR for this thesis is purely a reconciler and does nothing directly related to error detection or correction. This purity makes it independent of other components.
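As a rough behavioral sketch of this time-sharing idea, the Python model below serves one instruction access and one data access per processor cycle from a single shared memory. The class name, method name and list-based memory are hypothetical; the actual Reconciler is a hardware block with real timing constraints that this model ignores.

```python
class Reconciler:
    """Toy model: one shared memory presented as two processor-side memories."""

    def __init__(self, memory):
        self.memory = memory  # single address space shared by code and data

    def processor_cycle(self, instr_addr, data_addr, write_value=None):
        # First half of the processor clock: memory acts as instruction memory.
        instruction = self.memory[instr_addr]
        # Second half of the processor clock: memory acts as data memory.
        if write_value is not None:
            self.memory[data_addr] = write_value
            data = write_value
        else:
            data = self.memory[data_addr]
        return instruction, data


mem = [0] * 8
mem[0] = 0xA    # pretend instruction word
mem[4] = 0x55   # pretend data word
rec = Reconciler(mem)
print(rec.processor_cycle(instr_addr=0, data_addr=4))
```

Each call performs two memory accesses, which mirrors the requirement that the memory run twice as fast as the processor.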

When an error is detected by the voters, the Interrupt starts the Interrupt Service Routine (ISR). In order to store and read properly, this component has to run as fast as the Reconciler. The Interrupt replaces the current instruction on the bus with a TRAP instruction when an error occurs. This TRAP instruction is fetched and executed by all processors. The ISR is a special program designed to correct inconsistencies in register contents among the three processors. At the end of the ISR, the Interrupt injects a Jump instruction into the instruction bus and leads the processors back to normal operation.
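The instruction substitution performed by the Interrupt can be pictured as a small state machine. The sketch below is a simplified assumption: the strings `"TRAP"` and `"JUMP"` are placeholder mnemonics rather than KDLX encodings, and the `error` and `isr_done` inputs stand in for whatever signals the real component receives.

```python
TRAP = "TRAP"  # placeholder mnemonic, not an actual KDLX encoding
JUMP = "JUMP"  # placeholder mnemonic, not an actual KDLX encoding


class Interrupt:
    """Toy model of the Interrupt driving the instruction bus."""

    def __init__(self):
        self.in_isr = False

    def drive_bus(self, fetched, error=False, isr_done=False):
        # On a newly detected error, replace the fetched instruction with TRAP.
        if not self.in_isr and error:
            self.in_isr = True
            return TRAP
        # At the end of the ISR, inject a jump back to normal operation.
        if self.in_isr and isr_done:
            self.in_isr = False
            return JUMP
        # Otherwise pass the fetched instruction through unchanged.
        return fetched


unit = Interrupt()
trace = [
    unit.drive_bus("ADD"),
    unit.drive_bus("SUB", error=True),
    unit.drive_bus("SW"),
    unit.drive_bus("LW", isr_done=True),
    unit.drive_bus("ADD"),
]
print(trace)
```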

The ESSD latches specific data from the buses when an error occurs. These data are called the error syndrome, which is unique to one specific error. Error syndromes are very useful for checking the health of a system or debugging its errors. In order to latch data at the correct time, the ESSD has to run as fast as the Reconciler (or the Interrupt). The ESSD does not pass its data to the Reconciler when storing. Instead, it takes over the whole memory and saves error syndromes while the processors are deliberately stalled.

The full design consolidates all components to construct a complete TMR design. The design was simulated and its function was verified in this thesis. This initial design gives a big picture of how errors are detected and corrected. Furthermore, the interaction between different components is one of the important concepts to learn. The full design has four different clocks. The Reconciler, Interrupt and ESSD use the same clock speed since none of them needs a signal from another. The other three clocks are the KDLX clock, the memory clock and one special clock for the latch.

For further research, extra circuits or components are needed to improve error correction on the different components. For example, if an error is generated in the Reconciler, it may never be found, and the data stored to memory will always be wrong. Reinforcing the reliability of some components therefore needs to be considered. The current design may be modified to meet the requirements of advanced functions. Finally, a search for a better processor to enhance performance is required as well. Commercial processors usually come with a software package and have better customer support. Cores shared publicly through OpenCores are free, but a user needs a background in coding in order to realize the core.
I. INTRODUCTION


An electronic device in the space environment faces an extreme challenge to its reliability due to the lack of atmosphere and huge temperature variations. Without the protection of the atmosphere, a space system is exposed to a unique environment that contains cosmic rays (85% protons, 14% alpha particles and 1% heavy nuclei), solar events (X-rays, heavy ions and protons) and trapped radiation (electrons and protons trapped in the magnetic field of the Earth, called the Van Allen Belts). Thus, radiation effects on a space electronic system become one of the most important issues that need to be solved. Those effects include Total Dose Effects and Single Event Effects.

A number of methods have been presented to mitigate radiation effects. Using soft-core Triple Modular Redundancy (TMR) on a Field Programmable Gate Array (FPGA) provides a practical solution to Single Event Effects that is low cost, can be reconfigured, and is easily developed. The Configurable Fault-Tolerant Processor (CFTP) is a system based on this concept, utilizing Commercial-Off-the-Shelf (COTS) technology and the features of TMR soft-core microprocessors on FPGAs as a System-On-a-Chip (SOC).

A.  RADIATION  EFFECTS 

Radiation effects on a space system vary with altitude, location and solar events. For example, the inner Van Allen Belt, from 650 km to 6300 km above Earth’s surface, is composed mostly of protons of about 10 to 15 MeV (1 MeV = 10^6 eV, 1 electronvolt ≈ 1.6 x 10^-19 J). As a satellite travels in Low-Earth Orbit (LEO), from 160 to 6000 km, it has many chances to be affected by protons. The scheme used to solve radiation problems on this satellite must be different from the one used on a satellite in geostationary orbit, whose altitude is 35,780 km. Since a satellite in geostationary orbit has almost no protection from Earth, it needs to be more radiation-hardened (RADHARD) or radiation-tolerant. The major effects caused by radiation are Total Dose Effects and Single Event Effects (SEE), including Single Event Phenomenon (SEP), Single Event Upset (SEU), Single Event Latchup (SEL) and Single Event Burnout (SEB) [1].
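To give a feel for the proton energies quoted above, the following short calculation converts them to joules using the approximate conversion factors stated in the text. The function name `mev_to_joules` is only illustrative.

```python
EV_IN_JOULES = 1.6e-19  # approximate value of one electronvolt in joules
MEV_IN_EV = 1e6         # one MeV expressed in electronvolts


def mev_to_joules(energy_mev):
    """Convert an energy in MeV to joules using the approximate factors above."""
    return energy_mev * MEV_IN_EV * EV_IN_JOULES


for energy in (10, 15):
    print(f"{energy} MeV is about {mev_to_joules(energy):.1e} J")
```

A 10 MeV proton therefore carries on the order of 10^-12 J, a tiny amount of energy in everyday terms, yet enough to disturb the small charges that store a bit in a device.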
1. Total Dose Effects

Total Dose Effects refer to the total radiation that a device accumulates over its lifetime. This accumulation degrades performance until the device becomes totally useless. The general solution to mitigate these effects so far is to use radiation-hardening or shielding techniques, but such methods can only extend the life of the chip, not eliminate the problem.

2.  Single  Event  Phenomenon  (SEP) 

Single Event Phenomenon is the situation in which a transistor resets to its original state because of a particle passing through it. This causes unpredictable results and may or may not affect the operation of a system.

3.  Single  Event  Upset  (SEU) 

A Single Event Upset is a change in a logical bit caused by radiation. A bit flip may cause a chain reaction and consequently result in an unrecoverable system error. TMR is a mitigation scheme that uses three identical processors to run the same instructions and votes on all results to detect and correct such an error.

4. Single Event Latchup (SEL) and Single Event Burnout (SEB)

Single Event Latchup occurs when a parasitic transistor is formed by a spurious current spike, such as one caused by a heavy cosmic ray [2]. This puts a circuit into a high-operating-current mode that has to be cleared by a power off-on reset. Hard errors can drag the bus voltage down or even burn out the circuit. This is called Single Event Burnout.


Some  techniques  used  to  mitigate  radiation  effects  are  shown  in  Table  1. 


Radiation Effects             Mitigation Techniques
----------------------------  ----------------------------
Total Dose                    Radiation-Hardening
                              Silicon-On-Sapphire
                              Silicon-On-Insulator
                              Thin-Gate-Oxide
                              Shielding
Single Event Latchup (SEL)    Radiation Hardening
                              Guard Rings
Single Event Upset (SEU)      Quadded Logic
                              Software Fault Tolerance
                              Triple Modular Redundancy

Table 1. Radiation Effects and Mitigation (From Ref. [1].)
B. FIELD PROGRAMMABLE GATE ARRAY (FPGA)


Sequential programmable devices are composed of gates and flip-flops and are able to perform a variety of functions. The three major types of sequential programmable devices are the Sequential (or Simple) Programmable Logic Device (SPLD), the Complex Programmable Logic Device (CPLD) and the Field Programmable Gate Array (FPGA). An SPLD, which integrates an AND-OR array and flip-flops, is the smallest and cheapest form of programmable logic. A CPLD is similar to an SPLD except that it is a collection of individual PLDs whose interconnections are programmable as well. A typical CPLD is equivalent to 2 to 64 SPLDs. An FPGA consists of logic cells surrounded by a ring of programmable I/O blocks. Each cell can be programmed to implement a logic function, and all interconnections between cells are also programmable.
Figure 1. Composition of FPGA (From Ref. [3].)


Unlike the FPGA, PLDs need to be physically removed from a system and reprogrammed by specific methods. This disadvantage makes a space system built from these devices almost impossible to modify or upgrade. Programmed circuits can be easily instantiated on an FPGA without any specific requirements. This feature reduces the time-to-market of a product as well. Compared with other devices, FPGAs consume less power, are less expensive, and offer the large-scale advantages of programmable logic and high flexibility [4].

The FPGA selected for the CFTP is the Virtex XCV800, a member of the Virtex FPGA family from Xilinx¹. Table 2 shows the specifications of some Virtex family members. A CLB is a Configurable Logic Block, which can be configured to represent any 4-input switching function to define a design. CLBs are also connected to each other by programming as part of the design process. A design that is too large to fit into a single CLB can be partitioned across multiple CLBs for full implementation [5].

¹ Xilinx is a registered trademark of Xilinx Corporation.
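The statement that a logic block can represent any 4-input switching function can be illustrated with a lookup-table model. The sketch below is an idealized assumption rather than a description of the Virtex hardware: a 16-bit integer serves as the truth table, and the helper name `make_lut4` is invented for this example.

```python
def make_lut4(truth_table):
    """Build a 4-input function from a 16-bit truth table.

    Bit i of truth_table is the output for the input combination whose
    binary value (a as the most significant bit, d as the least) equals i.
    """
    def lut(a, b, c, d):
        index = (a << 3) | (b << 2) | (c << 1) | d
        return (truth_table >> index) & 1
    return lut


and4 = make_lut4(1 << 15)  # output is 1 only when all four inputs are 1
xor4 = make_lut4(0x6996)   # output is 1 when an odd number of inputs are 1

print(and4(1, 1, 1, 1), and4(1, 0, 1, 1))
print(xor4(1, 0, 0, 0), xor4(1, 1, 0, 0))
```

Because there are 16 input combinations and each can independently map to 0 or 1, changing the 16-bit table is enough to obtain any of the 65,536 possible 4-input functions.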


Device    System Gates  CLB Array  Logic Cells  Maximum Available I/O  Block RAM Bits  Maximum SelectRAM+™ Bits
--------  ------------  ---------  -----------  ---------------------  --------------  ------------------------
XCV50     57,906        16x24      1,728        180                    32,768          24,576
XCV100    108,904       20x30      2,700        180                    40,960          38,400
XCV150    164,674       24x36      3,888        260                    49,152          55,296
XCV200    236,666       28x42      5,292        284                    57,344          75,264
XCV300    322,970       32x48      6,912        316                    65,536          98,304
XCV400    468,252       40x60      10,800       404                    81,920          153,600
XCV600    661,111       48x72      15,552       512                    98,304          221,184
XCV800    888,439       56x84      21,168       512                    114,688         301,056
XCV1000   1,124,022     64x96      27,648       512                    131,072         393,216

Table 2. Virtex FPGA family members (From Ref. [6].)


One of the reasons for choosing this FPGA is that its package is a flat-pack. This type of interface is spaceflight certified and has been used in space for years. Some of the newest and largest FPGAs now use ball grid array (BGA) connections, which are not only difficult to attach to a printed circuit board but also not qualified for space applications [5].

C.  SOFT-CORE  PROCESSORS 

A soft-core processor is a set of source code written in a hardware description language (HDL) that expresses the behavior of a real processor. It is a synthesizable HDL design and has no explicit hardware realization. This type of processor provides great flexibility but is limited in performance and predictability. A hard-core processor, on the other hand, provides high performance but is not flexible.

Since a soft-core processor can be easily instantiated on an FPGA, a designer has a wide range of selections and combinations. A soft core can be optimized for different FPGA sizes and characteristics to improve performance, giving the most cost-efficient solution for the target application. A hard core, which has specific function blocks, needs to work with a special FPGA device. The demand for these specific FPGAs is limited; therefore, they do not have large-scale manufacturing benefits, which forces vendors to support only a few FPGA packages. Another disadvantage of a hard core is that if a problem is found in one version, all specific FPGAs supporting that version have to be revised. Hard cores are good for big and commonly used functions such as RAM [4].

The soft-core processor chosen for this iteration of the CFTP is the 16-bit Reduced Instruction Set Computer (RISC) KDLX processor. The DLX processor is coded in HDL and described in Hennessy and Patterson’s Computer Architecture: A Quantitative Approach [7]. The KDLX processor is a revision of the DLX processor by Dr. Kenneth Clark that was used on complex digital systems to predict SEU tolerance, as described in his dissertation [8]. Therefore, one of the reasons for using this processor is that it had already been designed and tested.

D. TRIPLE MODULAR REDUNDANCY (TMR)

Once a system is launched into space, it is hard and expensive to maintain. To correct errors caused by radiation, different approaches have been presented and actually used in space. Using RADHARD devices and using fault-tolerant designs are the most common. TMR is one of the solutions that enables a circuit to tolerate the occurrence of an error and correct it. This is done by software, so it is simple and low-cost. Because it is instantiated inside an FPGA, the TMR can be easily modified and upgraded in the future.

Basically, a TMR system is composed of three identical devices and voting logic, as shown in Figure 2. The voting logic is a majority voter, which takes the majority of the inputs as the output value. Since Devices B and C are replicas of Device A and all three accept the same input value, the outputs of A, B and C should in theory be consistent. Due to radiation effects in space, one of the three devices may have an internal error and generate a different output. This inconsistency is caught and corrected by the voting logic. Thus, under the assumption of a single error, the voted output is always correct.
[Figure 2 diagram: a common Input feeds Device A, Device B and Device C; Output A, Output B and Output C feed the voting logic, which produces the Voted Output and the Error Corrected Signal.]

Figure 2. Basic TMR Concept (After Ref. [1].)


Figure 3 illustrates the TMR concept applied to a microprocessor. All output signals of the CPUs are voted; therefore, no error should exist at the outputs of the voters. Any error reported indicates that one of the CPUs has an internal error. If that error is not corrected in some way, it may result in more errors and finally become unrecoverable. Thus, the Error Encoder in Figure 3 is a device that analyzes the error signals provided by the voters and determines which CPU generated the error. Once the faulty CPU is identified, some extra circuits interrupt all three processors and correct that error. When a simple circuit acting as a system is instantiated on a chip (e.g., an FPGA), it is called a system on a chip (SOC). Recall that a soft core is not efficient for complex functions; therefore, the memory block in Figure 3 is an external chip.
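The role of the Error Encoder, identifying which CPU disagrees with the other two, can be sketched as follows. This is a hypothetical software analogue: the function name `encode_error` and the returned labels are assumptions, and the real encoder works on the voters' error signals rather than on the raw output words.

```python
def encode_error(out_a, out_b, out_c):
    """Identify the single disagreeing CPU from three output words.

    Returns None when all outputs agree, "A", "B" or "C" when exactly one
    CPU differs from the other two, and "multiple" when no two agree.
    """
    if out_a == out_b == out_c:
        return None
    if out_b == out_c:
        return "A"
    if out_a == out_c:
        return "B"
    if out_a == out_b:
        return "C"
    return "multiple"  # outside the single-error assumption


print(encode_error(0x00FF, 0x00FF, 0x00FF))
print(encode_error(0x00FF, 0x00FB, 0x00FF))
```

The "multiple" case falls outside the single-error assumption stated for TMR; a majority vote cannot be trusted there, so the sketch reports it separately.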


[Figure 3 diagram; recoverable labels: Common Inputs, To Output Interface.]

Figure 3. Microprocessor TMR Concept


The CFTP implements these basic ideas. The circuits that perform the interruption and correct an error are quite complicated. All concepts for constructing a complete TMR design are explained in the remaining chapters.
E. ORGANIZATION

Chapter II reviews previous theses and gives other information related to the CFTP. Chapter III describes the testing environment and introduces the software used in the thesis. Chapter IV discusses the function and features of the KDLX and shows simulations of all KDLX instructions. Chapter V goes over the design of the voter logic in previous theses, then constructs the TMR Assembly and simulates it. Chapter VI describes the Reconciler used to coordinate the different architectures in this design. Chapter VII describes the Interrupt module designed for correcting errors in the registers. Chapter VIII shows the simulation of the full design without any circuitry to handle the reporting of errors; it explains the function of the ISR and how the different components work together. Chapter IX introduces the component used to store the data needed for future analysis when an error occurs. This component is the Error Syndrome Storage Device, and its function in the full design is verified in this chapter. Chapter X contains conclusions and topics for follow-on research.

F.  ADDITIONAL  DOCUMENTATION 

Appendix A contains all schematics, test benches, and simulation results discussed in this thesis. Some of the figures are zoomed in to provide better views of the small numbers on the buses. Appendix B describes the whole instruction set of the KDLX. Appendix C contains the VHDL code for all components designed in this thesis. The VHDL files for the KDLX processor are also included.

G.  CHAPTER  SUMMARY 

This chapter has given a fundamental understanding of radiation effects, FPGAs and soft-core processors. The general concept of a TMR design has been introduced as well. Previous thesis work on the CFTP is reviewed in the next chapter, and the TMR technique for correcting an error is also described. Reading earlier thesis work is always a good starting point for learning. Experience is shared and the direction for subsequent research is pointed out.
II. TMR REVIEW IN PREVIOUS WORK


Constructing a CFTP design is complex work that needs a significant amount of time to finish. In order to have a flawless design, many conditions need to be considered and all problems should be solved in a reasonable way. Selecting components may take a few days or months, depending on how much data or information is collected. Decisions may still change at the last minute due to unpredictable situations or inevitable factors. Any change to a component in the final design will sometimes cause a series of modifications to others. Building a fully functional CFTP clearly takes much effort, and designers have to understand how the circuits relate to each other in order to revise or debug it. Unfortunately, graduate students at the Naval Postgraduate School stay only a short time. A big design like the CFTP is divided into several segments and assigned to different students. Under these time constraints, students not only need to understand what previous students have done but also must take up a design in progress. Most of the time, students picking up the segments do not have a chance to learn directly from students who worked on the design before. Thus, the thesis becomes an important means of passing experience between generations of students.

A. LASHOMB’S DESIGN

Peter A. LaShomb [1] presented many concepts in both TMR design and FPGA selection. Traditional solutions for radiation effects were introduced, including hardware redundancy, such as Quadded Logic, and software improvements for fault tolerance, such as time redundancy or software redundancy. In the TMR section, RADHARD and COTS devices were compared in availability, performance and cost. The potential benefits of the two were clearly described as well. The processor used in his TMR design was the KCPSM, an 8-bit microcontroller. It was downloaded free from Xilinx’s website and served as a readily available test-case processor while awaiting the availability of other high-performance processors. Construction and testing of the TMR were done with the Xilinx Foundation series software, which was available at the Naval Postgraduate School (NPS). Voters and an error encoder were designed and explained in detail. Other issues, including the interrupt routine and the memory/error controller, were left as follow-on research.
In the FPGA section, different FPGAs were compared in a number of aspects. The five major parameters for choosing a good FPGA were gate count, availability of hardware and software, package (flat-pack vs. ball-grid-array), re-programmability and radiation tolerance. The Xilinx XCV800 was chosen at that time as the candidate for future implementation.

B. EBERT’S RESEARCH

A complete CFTP conceptual design was presented in Dean A. Ebert’s thesis [9]. For hardware considerations, his thesis discussed why specific components were chosen and how the chips communicated in an integrated circuit. The FPGA and CFTP configurations were described in more detail and more realistically than before, and chips were selected based on a number of space-environment considerations. The discussion of system memory was important and appeared first in this thesis. The memory configuration controller, functional logic and glue logic were also new ideas not discussed in previous work. The TMR circuitry was not one of the main topics of his research, but from his work one can visualize the external connections of the FPGA and understand the role of TMR in the CFTP process. Figure 4 illustrates the layout of the board he developed.


Figure  4.  CFTP  Conceptual  Diagram  (From  Ref.  [9].) 
The CFTP will be launched into LEO on two satellites, NPSAT-1 and MidSTAR-1, in 2006. His thesis described how the Department of Defense and Navy Space Experiment Review Board (SERB) and the Space Test Program (STP) Office were involved with these two satellites. Other documents related to design descriptions and the requirements of the STP Office were attached as appendices as well.

C.  JOHNSON’S  IMPLEMENTATION 

Steven A. Johnson [5] focused his work on TMR design. The essential components for making a circuit fault-tolerant were identified. The circuits designed in LaShomb’s thesis could not be used because of the different design architecture and the significant upgrade of the computer-aided-design software employed. The basic concepts for constructing a TMR circuit were still the same but were implemented in a different way.

The KDLX, a 16-bit processor more capable than the 8-bit KCPSM, was the processor used in Johnson’s research. His design consisted of tmra, Interrup, the Error Syndrome Storage Device (ESSD) and the Reconciler. The block named tmra consists of three KDLX processors and six voters. All processor output signals have to be voted. Interrup was compiled from a state diagram and used to trigger the interrupt service routine to correct an error inside the KDLX. The ESSD was used to save the error syndrome in order to provide a log file for analysis. The KDLX is a Harvard architecture device, which has two address buses and two data buses: one address and data bus pair for instruction memory and another pair for data memory. The off-chip memory for the CFTP uses a Von Neumann architecture, which has only one address bus and one data bus. Because of this difference, a Reconciler was designed to coordinate the different timing constraints in order to read and write memory properly. The difference between the Harvard and Von Neumann architectures is explained again when the KDLX is introduced in Chapter IV.

Johnson's full design schematic is shown in Figure 5. The memory is external to the FPGA
and is connected to the Reconciler, located at the top left corner. Normally, tmra
communicates with the Reconciler in order to access memory. Meanwhile, the syndrome data
is latched into the ESSD whether or not an error has occurred. When an error occurs, a
signal is sent to Interrupt, which starts the Interrupt Service Routine (ISR). At this
moment, the KDLX is stalled and the ESSD saves the error syndrome to memory through the
Reconciler. Interrupt then issues a TRAP instruction to the KDLX and puts the whole
circuit into an error correction condition. When the KDLX sees the TRAP instruction, it
jumps to a specific memory location, and the program counter value before the jump is
saved in the interrupt address register (IAR), a special register inside the KDLX. In the
error correction condition, the contents of all registers inside the KDLX are saved to
memory through the voters. Then each register is reloaded from memory. The purpose of this
step is to correct any inconsistencies among the registers of the three KDLX processors.
Since all contents have to pass through the voters while being saved, an error inside any
register will be corrected.
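
The save-through-voters and reload step can be illustrated with a minimal sketch. It is not Johnson's circuit: it assumes a bitwise two-out-of-three majority voter, and the names vote and scrub as well as the register values are hypothetical, chosen only for illustration.

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit is the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def scrub(regs_a, regs_b, regs_c):
    """Save every register through the voter, then reload all three copies."""
    memory = [vote(a, b, c) for a, b, c in zip(regs_a, regs_b, regs_c)]
    return list(memory), list(memory), list(memory)


# Three copies of a small register set; one copy suffers a single bit flip.
good = [0x0000, 0x001D, 0x0015]
upset = list(good)
upset[1] ^= 0x0004  # simulated SEU in one register of one processor

copy_a, copy_b, copy_c = scrub(good, upset, list(good))
print(copy_a == copy_b == copy_c == good)
```

Because only one of the three copies is disturbed, the voted value written to memory equals the undisturbed value, and reloading makes all three register sets consistent again. Two simultaneous upsets in the same bit position would defeat this scheme.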

The last instruction in the ISR is Return From Exception (RFE). This instruction indicates
the end of the ISR and causes the program counter saved in the IAR to be loaded back into
the KDLX. The set of logic gates at the bottom of Figure 5 is a simple encoder of the RFE
instruction that tells Interrupt to stop the ISR. Finally, the whole circuit goes back to
its normal operation.

This circuit gave a first illustration of the complexity of the design and was built
based on theory. Simulations and timing problems were left as follow-on research. The
software showed that, even with such a large circuit built inside, the XCV800 FPGA still
had plenty of space and I/O blocks available.

D.  CHAPTER SUMMARY

This chapter introduced work done by previous graduate students in order to point to
where other resources can be found. This thesis focuses mainly on the TMR design and
follows the concepts in Lashomb's and Johnson's research. The preliminary design has been
done and the general concepts have been given. Interrupt takes over the whole circuit when
an error occurs. Specific locations in memory are reserved for the ISR and for storing
error syndromes. No other instructions should be able to access these locations.

In the next chapter, the testing environment and ISE software are introduced. Developing
a consistent testing environment is important for making valid comparisons. A description
of the software tools also helps a reader understand more about the simulation.

Figure 5.  Full TMR Design Schematic (From Ref. [5].)

III.  TESTING ENVIRONMENT AND ISE SOFTWARE


It is hard to build a circuit without simulating it, since simulation is the cheapest and
fastest way to verify whether a design works. The software used for simulation and the
software used for constructing circuits do not need to be made by the same company.
Different programs may compile code or run simulations in different ways. A circuit built
with functions specific to one program may not fit into other programs. Therefore, a
designer using programs made by different persons or companies sometimes faces
incompatibility problems. This issue can be avoided by buying the tools as a package.
Generally speaking, products made by the same company are more compatible with each
other, and it is easier for that company to provide complete customer service.

Simulation is a very important component of design. A good design without a proper
simulation may have degraded performance or efficiency. Sometimes inaccurate simulation
results can mislead a designer into modifying something that is not supposed to be
modified. A good simulation result not only supports one's design but also helps others
understand the concept embodied in it. In terms of thesis research, simulation helps the
designer and others verify the design without spending too much time. Follow-on students
can simply rerun the program and confirm that the results are consistent.

All test bench settings for the simulations are provided in this thesis. This kind of
information is often not published with testing or simulation results. Providing a
simulation result without its parameters means that others may not understand the testing
background and may be unable to build an identical test bench. This is not important for
a casual reader on the web, but it is important for a graduate student working on a
thesis. First, a program sometimes crashes and files are lost, which means someone may
never get the same simulation outputs again. Second, a modified circuit sometimes needs a
new test bench. Without those parameters, simulation will be done under different testing
environments and a performance improvement may not be demonstrable.

A.  COMPUTER SPECIFICATIONS

System performance is often an important factor for testing. Running a program on a slow
machine takes longer than on a fast machine, but the program result should be the same.
When timing issues are considered, system performance can play an important role. A slow
computer may be unable to handle a large amount of data and sometimes forces a user to
reboot. As the TMR design gets more complicated, simulation will take longer. The rate at
which a system can process data may affect the accuracy of simulation. Computer magazines
usually state the specifications of the testing environment, especially when testing the
performance of new hardware.

The TMR design so far is not complicated enough to need a high-performance computer for
simulation. The information offered in Table 3 can be used as a reference in future
thesis work.


Model               IBM ThinkPad A31 (2652Q5U)
Processor           Pentium®² 4 2.0 GHz
Memory              1 GB PC2100 DDR SDRAM
Hard Drive          40 GB 4200 RPM
Operating System    Windows 2000 Professional
OS version          5.0 Service Pack 3
Video Card          Mobility Radeon 7500 AGP

Table 3.  Computer Specifications for Simulation


B.  XILINX ISE SOFTWARE

The software used for constructing the TMR design is a package called ISE made by
Xilinx®³, one of the largest FPGA manufacturers in the world. This software is available
at NPS and is used in labs for some courses. Students who want to do FPGA design should
have a basic understanding of this program. In order to do this research, it was
necessary to learn about ISE and its associated simulator from the Xilinx website [10],
an in-depth tutorial [11] or personal experience.


2  Pentium  is  a  registered  trademark  of  Intel  Corporation. 

3  Xilinx  is  a  registered  trademark  of  Xilinx  Corporation. 
ISE 5.2.03i was the version used for this thesis. Project Navigator was the overall
controller of the ISE design system. The other important program used in this thesis,
ModelSim®⁴, is a powerful simulation tool. Its full version name is ModelSim XE II 5.6e.
The logos of Project Navigator and ModelSim are shown in Figures 6 and 7.

Figure 6.  Xilinx ISE Project Navigator Logo (Release Version 5.2.03i)


Figure 7.  Xilinx ISE ModelSim Logo (ModelSim XE II 5.6e)

The FPGA selected for the CFTP was a Xilinx Virtex XCV800 HQ240 with a speed grade of -4.
This is an FPGA with about 800,000 gate equivalents, in a package with 240 pins. Because
the FPGA and the tools come from the same manufacturer, using ISE to develop and simulate
the TMR design should give a better design and a more realistic simulation than other
programs would.

While this research was being performed, Xilinx released a new version, ISE 6.1i, to its
customers. Xilinx has warned that loading a project made in an older version of ISE into
ISE 6.1i makes an unrecoverable change, after which the project can no longer be read by
older ISE software. Since many simulations had already been done at that point, and in
order to keep all testing environments consistent, simulation on the latest version is
left as part of future work.

4  ModelSim is a registered trademark of Mentor Graphics Corporation.

C.  CHAPTER  SUMMARY 

This chapter summarized the hardware and software information along with the simulation
environment. Simulation may look different in different software versions, and sometimes
new errors are generated. Undiscovered errors or potential defects of a design may be
pointed out by a newer software version. Sometimes the differences between the new and
old program are described in the user guide or on the company's website. It is good to
know the main changes in new software and to expect changes to an old design. Work
becomes more efficient if one can exploit a program's features and functions.

Components in the TMR design are introduced in the following chapters. Before the full
design is constructed, each circuit is built and tested. Therefore, simulation results
are used to explain how a circuit functions.

IV.  KDLX INTRODUCTION


The KDLX, a 16-bit processor, is the kernel of this TMR design. Each component in the
design is connected to a KDLX processor and tested as the final procedure. The KDLX is
the soft-core processor used for each of the three processors in the design of the TMR
system, as shown in Figure 3. Because of the features of the KDLX pipeline and wiring
delays, a circuit that works in a test bench by itself sometimes does not work with a
KDLX. Knowing the KDLX helps a designer foresee problems when building a circuit with it.
Therefore, understanding the KDLX is the first step in constructing a TMR design.

A.  INSIDE  KDLX 

The KDLX is coded in VHDL, the VHSIC (Very High Speed Integrated Circuit) Hardware
Description Language. It is composed of two top-level blocks, core and IO_Pads, as shown
in Figure 8. The names core and IO_Pads are block names; core_1 and IO_Pads_1 are the
local block names representing core and IO_Pads, respectively, in the VHDL file called
"dlx.vhd". The word KDLX at the top right corner is the name of the outer block. Numbers
next to input and output pins give the width of the bus. Words in bright green are local
signals, and none of the interconnections between these local pins are accessible from
the outside (e.g., the connection between In_Data on IO_Pads_1 and Input_data on core_1).
All pins on the left side are input signals and all pins on the right side are output
signals, except the Data bus. Controlled by IO_Pads_1, the data bus on the KDLX is
bi-directional. It sends out data when writing to memory and stays at high impedance
otherwise. High impedance allows other devices connected to the data bus to drive the
bus, but data will not be accepted by the KDLX at this moment even if it flows inbound.
The dashed sky-blue line inside IO_Pads_1 is an internal connection. This internal
connection functions only when the input signal Out_En_n is low.

Notice that most input and output pins of the KDLX are the same as those of core_1. The
function of IO_Pads_1 is to interface the external bi-directional data bus to the input
data and output data buses on core_1. To understand the KDLX better, the core needs to be
explored.

Figure 8.  Inside KDLX

Major functional blocks are all inside core and are shown in Figure 9. These blocks are
zero_test, pipeline, regfile, pc_control, rw_control, alu, word_reg_single, word_mux3 and
word_mux4.



Figure  9.  Inside  core 

The local block name used in the file "core.vhd" is boxed at the top of each function
block. Words in bright green are again local signals, and those in sky blue represent
signals that are global only within the core. They are considered global signals because
most blocks have these signals and all receive the same value. For instance, all blocks
receive zero when the signal resetn is low. When the global signal Shift_En is low, the
local block pipeline_1 may invert this signal to high internally and use it to trigger
other functions. Therefore, Shift_En being low in the core does not mean this signal is
low inside pipeline_1. That is why these signals are treated as global for the core only.

The detailed functioning of each block is described in the KDLX VHDL code. Figures 8 and
9 are plotted directly from the original VHDL code to illustrate how these components
connect. The functions of important components such as alu, regfile, pc_control,
rw_control and pipeline are summarized here. Simulation of the KDLX later will verify
these functions.

1.  Function of alu

This block is able to do addition, logic computation, and barrel shifting. Subtraction
can be achieved by adding a positive number to a negative number. The KDLX uses 2's
complement arithmetic for calculation. A simple table of 8-bit 2's complement numbers is
shown in Table 4.


Binary number    Equivalent decimal number

01111111          127
00000011            3
00000010            2
00000001            1
00000000            0
11111111           -1
11111110           -2
11111101           -3
11111100           -4
10000000         -128

Table 4.  2's Complement Numbers
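
As a hedged illustration of the arithmetic described above, the following sketch converts between signed integers and 8-bit 2's complement patterns and performs subtraction by adding a negated operand. The function names are hypothetical and the code models only the number representation, not the KDLX alu block itself.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the bit pattern of a signed integer in 2's complement."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Return the signed integer represented by a 2's complement bit pattern."""
    raw = int(pattern, 2)
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


def subtract(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b by adding the 2's complement of b and dropping the carry out."""
    mask = (1 << bits) - 1
    total = ((a & mask) + ((-b) & mask)) & mask
    return from_twos_complement(format(total, f"0{bits}b"))


print(to_twos_complement(-3))            # bit pattern for -3
print(from_twos_complement("10000000"))  # most negative 8-bit value
print(subtract(3, 2))                    # 3 + (-2)
```

The sketch does not detect overflow: a result outside the range -128 to 127 wraps around, as it would in a fixed-width adder.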

Logic computation includes the logic AND, OR and XOR functions. The KDLX allows a user to
do logic computation between the contents of two registers or between the contents of a
register and an immediate value.

A built-in barrel shifter gives the KDLX the ability to do logical or arithmetic shifting.

2.  Function of regfile

All 15 registers of the KDLX are in this block. The inbound data bus is connected to all
registers and an enable bus is used to control which register is being written. Two big
muxes, MUXA and MUXB, route the output of a selected register to the outbound data bus.

3.  Function of pc_control

The program counter sends the address to the instruction memory in order to fetch the
instruction for the next step. The pc_control block plays an important role while
executing a Branch, Jump or TRAP instruction. For some instructions, such as Jump and
Link, pc_control saves as the return address the address of the instruction that comes
after the next 2 instructions. This is because the KDLX is pipelined and, therefore, the
two instructions after the Jump are executed before the jump occurs. The return address
is saved in register 15. Since no instruction in the KDLX is able to read the return
address in register 15 directly, another circuit needs to be constructed in order to jump
back to where the Jump and Link instruction left off.

Another important component in pc_control is the interrupt address register (IAR), which
was mentioned in the description of Johnson's implementation. The IAR is a register not
accessible to a user. This special register is used only to save the return address of
the TRAP instruction. When the TRAP instruction is executed, the return address (which is
the address right after the next 2 instructions) is saved into the IAR. After this, the
program counter jumps to another memory location and starts reading another set of
instructions. Another instruction named Return From Exception (RFE) is placed at the end
of that instruction set. RFE reads the IAR and jumps back to the memory location
indicated. The jump, branch and trap implementations are discussed again when the KDLX is
simulated in this chapter.
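
The TRAP and RFE behavior described above can be sketched as a small behavioral model. This is an illustration only, not the pc_control VHDL: the class and method names are hypothetical, and the model assumes that the program counter advances by one address per instruction, so that the address "right after the next 2 instructions" is the TRAP address plus three.

```python
class PcControlModel:
    """Behavioral sketch of the program counter and the IAR (hypothetical names)."""

    DELAY_INSTRUCTIONS = 2  # instructions already in the pipeline behind the TRAP

    def __init__(self) -> None:
        self.pc = 0
        self.iar = 0  # interrupt address register, not visible to the user

    def trap(self, trap_address: int, handler_address: int) -> None:
        """Save the return address in the IAR and jump to the handler."""
        # Assumption: one address per instruction.
        self.iar = trap_address + self.DELAY_INSTRUCTIONS + 1
        self.pc = handler_address

    def rfe(self) -> None:
        """Return From Exception: reload the program counter from the IAR."""
        self.pc = self.iar


model = PcControlModel()
model.trap(trap_address=0x10, handler_address=0x80)
print(hex(model.iar), hex(model.pc))
model.rfe()
print(hex(model.pc))
```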

4.  Function of rw_control


This is where the KDLX controls the read, write and program read signals for the memory
modules that are attached to it. An important point here is that the KDLX read and write
signals are active low, i.e., each is asserted when it is driven low. These two signals
are activated at the falling edge of the clock.

5.  Function of pipeline

Inheriting the nature of DLX, the KDLX is a five-stage pipelined processor; the stages
are Fetch, Decode, Execute, Memory and Write Back. At the Decode stage, the signals used
to select registers in regfile are assigned. At the Execute stage, eight instructions are
specifically monitored: Jump, Jump and Link, Branch if Equal Zero, Branch if Not Equal
Zero, RFE, TRAP, Jump Register, and Jump Register and Link. At the Memory stage, signals
are generated to allow the KDLX to read from or write to memory. The last stage, Write
Back, allows most instructions, with a few specific exceptions, to write to registers.

6.  KDLX  Summary 

The ISE software is able to convert VHDL code to a schematic, so the user has the option
of studying a circuit without understanding the VHDL code. The schematic is more
graphical than code and allows people to see how the circuit is wired. The schematic
symbol of the KDLX is shown in Figure 10.


Figure 10.  Schematic Symbol of KDLX

a.  Inputs and Outputs

As mentioned earlier, the KDLX has four inputs, five outputs and one bi-directional bus.
The four inputs are three 1-bit pins, i.e., clock_in, resetn and stalln, and one 24-bit
instruction bus. The five outputs are three 1-bit pins, i.e., prog_rd, rd and wr, and two
16-bit buses, i.e., addr_int(15:0) and pc(15:0). The only bi-directional bus is a 16-bit
data bus. The functions of these pins are listed in Table 5.


Symbol            Signal Name        Function

clock_in          Clock input
resetn            Reset              Reset KDLX when low. All register contents are
                                     cleared.
stalln            Stall              Stall KDLX when low. Stall everything including
                                     data in pipeline stage.
instr(23:0)       Instruction Bus    Receive instructions sent from instruction memory.
prog_rd           Program Read
rd                Read               Read data from data memory when low.
wr                Write              Write data to data memory when low.
addr_int(15:0)    Data Address       Send data address to data memory.
pc(15:0)          Program Counter    Send instruction address to instruction memory.
data(15:0)        Data Bus           Receive data from data memory or send data out to
                                     data memory.

Table 5.  Function of Pins on KDLX

b.  Harvard  Architecture  and  Von  Neumann  Architecture 

The KDLX is a Harvard architecture device that has one pair of address and data buses for
instruction memory and another pair for data memory. Figure 11 illustrates the concept of
this architecture. The device at the center sends the address of an instruction to the
instruction memory. The instruction memory on the left then sends the instruction back to
the device. If the instruction received is to read or write data in data memory, the
device at the center sends a data address to the data memory on the right side to
indicate the memory location it wants to read or write. If the device wants to read, the
data bus is driven by the data memory and data is sent from the data memory to the
device. If the device wants to write, the data bus is driven by the device and data is
sent from the device to the data memory.

Figure 11.  Harvard Architecture


Applying the same concept to the KDLX gives the connections shown in Figure 12.


Figure 12.  KDLX Connections with Two Memories


The Von Neumann architecture, on the other hand, has only one address bus and one data
bus. A single memory is used in this architecture. A processor using the Von Neumann
architecture has fewer timing issues to resolve with memory, since both share the same
architecture. A Harvard-architecture processor, e.g., the KDLX, needs to deal with
possible timing mismatches with memory if only one memory is available. In the CFTP
design, only one memory is available for the TMR circuit, so it serves as both the
instruction memory and the data memory. Recall that the component called Reconciler in
Johnson's implementation is the device used to integrate these two different
architectures.

In order to connect a four-bus processor to a two-bus memory, the memory has to run at
double speed to support two accesses per clock cycle. Figure 13 shows how the KDLX
communicates with only one memory.


Figure  13.  KDLX  with  One  Memory 

Since the KDLX is a pipelined processor, it needs to be able to read or write data at the
same time it fetches an instruction. Both of these events can happen in one KDLX clock
cycle. If the memory is twice as fast as the KDLX, it is able to deal with the
instruction in the first memory clock cycle and with data in the second memory clock
cycle. In Figure 13, pc(15:0) and instr(23:0) are handled in the first memory clock
cycle; addr_int(15:0) and data(15:0) are handled in the second memory clock cycle. The
memory used here needs to be a 24-bit memory because of the width of the instruction bus.
Because the KDLX data bus is only 16 bits wide, only the lower 16 bits of data are
accepted and the rest are buffered out.
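
The two-accesses-per-cycle idea can be sketched in a few lines. This is only a behavioral illustration under stated assumptions: the memory is modeled as a Python list of 24-bit words, the function name kdlx_cycle and the stored values are hypothetical, and no timing, stall or Reconciler behavior is modeled.

```python
def kdlx_cycle(memory, pc, addr_int=None, write_data=None):
    """Model one processor clock cycle as two back-to-back memory clock cycles."""
    # First memory clock cycle: instruction fetch, full 24-bit word.
    instruction = memory[pc] & 0xFFFFFF

    # Second memory clock cycle: optional data access on the same memory.
    read_data = None
    if addr_int is not None:
        if write_data is not None:
            memory[addr_int] = write_data & 0xFFFF
        else:
            # Only the lower 16 bits are accepted by the 16-bit data bus.
            read_data = memory[addr_int] & 0xFFFF
    return instruction, read_data


memory = [0] * 256
memory[0] = 0x440305   # an instruction word
memory[5] = 0xAB001D   # a 24-bit word whose upper 8 bits are ignored on a data read

instruction, data = kdlx_cycle(memory, pc=0, addr_int=5)
print(hex(instruction), hex(data))
```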

B.  PIPELINE CONCEPTS

The KDLX is a five-stage pipelined processor. These five stages are Fetch, Decode,
Execute, Memory (Mem) and Write Back (WB). When doing a write, data is written to a
register at the third clock cycle, i.e., the Execute stage. Therefore, a destination
register used in one instruction is not available until 2 clock cycles later. This
concept has a significant impact when creating a test bench. Figure 14 shows the pipeline
execution of the KDLX in normal operation.


Instruction                                    Clock cycle
number          1        2        3        4        5        6        7        8        9

Instruction 1   Fetch    Decode   Execute  Mem      WB
Instruction 2            Fetch    Decode   Execute  Mem      WB
Instruction 3                     Fetch    Decode   Execute  Mem      WB
Instruction 4                              Fetch    Decode   Execute  Mem      WB
Instruction 5                                       Fetch    Decode   Execute  Mem      WB

Figure 14.  Pipeline Execution in KDLX


In Figure 14, if Instruction 1 loads data from memory into register 3 (for example), the
action to load register 3 starts at clock 3 and ends at clock 5, which means register 3
should not be accessed as a source register in Instructions 2, 3 and 4. Otherwise,
Instructions 2, 3 and 4 will fetch either a wrong value or undefined data. Thus the new
value of register 3 is available only to Instruction 5 or later.
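
The spacing rule in the preceding paragraph can be checked mechanically when a test bench is written. The sketch below is a simplified model, not part of the thesis tool flow: it assumes one instruction is fetched per clock, takes from the text the rule that a result written by instruction n is usable from instruction n+4 onward, and represents each instruction as a hypothetical (destination, sources) pair with None standing for no destination.

```python
STAGES = ("Fetch", "Decode", "Execute", "Mem", "WB")
SETTLE_DISTANCE = 4  # assumption from the text: result of instruction n usable from n+4


def stage_at(instruction: int, clock: int):
    """Stage occupied by a 1-based instruction at a 1-based clock, or None."""
    offset = clock - instruction
    return STAGES[offset] if 0 <= offset < len(STAGES) else None


def find_hazards(program):
    """Return (reader, writer, register) triples that read a register too early."""
    hazards = []
    for reader, (_, sources) in enumerate(program, start=1):
        first_writer = max(1, reader - SETTLE_DISTANCE + 1)
        for writer in range(first_writer, reader):
            dest = program[writer - 1][0]
            if dest is not None and dest in sources:
                hazards.append((reader, writer, dest))
    return hazards


NOP = (None, ())
# Load into R3, three NOPs, then an instruction that reads R3: no hazard expected.
safe_program = [(3, (0,)), NOP, NOP, NOP, (None, (3,))]
# Reading R3 only two instructions after the load is flagged.
unsafe_program = [(3, (0,)), NOP, (None, (3,))]

print(find_hazards(safe_program))
print(find_hazards(unsafe_program))
```

A checker of this kind only counts instruction slots; it does not model stalls, branches or the two instructions executed after a jump.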

C.  MEMORY IN SIMULATION


All components generated for the TMR design were simulated with the KDLX and memory as
the final step. The ISE software offers several different kinds of RAM and ROM as
schematic components for users to choose from. A designer can also construct a memory in
VHDL code. Another feature, the CORE Generator (Coregen), is a graphical interactive
design tool in the ISE software that helps a user design a module. Because of its
simplicity, the memory used in this thesis was generated with Coregen.

A 24-bit memory with its simulation result is shown in Appendix A, section A. For
explanation, a copy of this simulation is reproduced as Figure 15.

Figure 15.  24-bit Memory Simulation Result

Values on the address bus and input data bus are assigned in the test bench. In this
simulation, memory is being written at point 1. The first value (i.e., OOOOdVie) is
written into memory location 00₁₆, the second value (i.e., OOOOdCie) is written into
memory location 01₁₆, and so on. At point 2, the memory starts being read and all values
are output as originally written. One of the features of this memory is that data sent to
the data_in bus for writing comes out on the data_out bus. A designer can monitor the
data written into memory from here. The write enable signal of this memory is active low;
therefore the memory reads when this signal is high.

Memory used in simulation can be a RAM or a ROM. A ROM is used as an instruction memory,
which is not allowed to be written. A RAM can be initialized by writing to it before
using it, but a ROM cannot, since it does not have a write enable pin. Thus, a ROM needs
to be pre-configured. In the ISE software, a user needs to generate a coe file and load
it before the memory is generated in Coregen.
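
A coe file is a short text file, so it can be produced by a script rather than typed by hand. The sketch below is a hedged example: it assumes the commonly used coe keywords memory_initialization_radix and memory_initialization_vector, which should be checked against the Coregen documentation for the memory core in use, and the helper name make_coe and the example words are hypothetical.

```python
def make_coe(words, width_bits: int = 24) -> str:
    """Build the text of a coe file holding the given words in hexadecimal."""
    digits = (width_bits + 3) // 4  # hex digits needed per word
    limit = 1 << width_bits
    for word in words:
        if not 0 <= word < limit:
            raise ValueError(f"{word:#x} does not fit in {width_bits} bits")
    body = ",\n".join(format(word, f"0{digits}x") for word in words)
    return (
        "memory_initialization_radix=16;\n"
        "memory_initialization_vector=\n"
        f"{body};\n"
    )


# Two example 24-bit instruction words.
coe_text = make_coe([0x440305, 0x450507])
print(coe_text)
```

The returned string can be written to a file with a .coe extension and selected in Coregen when the ROM is generated.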

The memory offered in the ISE software is not a true Von Neumann architecture memory,
since it has separate buses for data input and output. For simplicity, the TMR design in
this thesis uses this kind of memory. Further modification is needed when a true Von
Neumann architecture memory is available.

D.  KDLX  SIMULATION  WITHOUT  MEMORY 

Operation codes (Opcodes) for the instruction set are described in Appendix B. This
appendix includes all instructions that can be implemented in the KDLX. Simulating all
instructions is one of the best ways to understand how the KDLX functions. Before doing
that, a simple simulation of the KDLX itself is shown in Appendix A, section B. Figure 16
is a copy of this simulation result for explanation. All registers in the KDLX are
initialized to the value 0000₁₆ and register 0 is always zero.

Figure 16.  KDLX Simulation


In Figure 16, the first instruction at point 1 loads the value at memory location
[(register 0)+05] into register 3. The read signal goes low at point 2. Comparing the
timing here with Figure 14 shows that the action on the register occurs in the Execute
clock cycle. Since two values, 001d₁₆ and 0015₁₆, are already available on the bus, the
KDLX loads them into register 3 and register 5, respectively. Recalling the pipeline
features discussed with Figure 14, the new content of register 5 is not available at any
clock cycle before point 3. Using register 5 anywhere before point 3 will use the old
value in register 5, which is 0000₁₆ in this case. In this simulation, three NOPs are
inserted before register 5 is used.

At point 3, instruction 450507₁₆ stands for storing the content of register 5 to the
memory location [(register 0)+07]. Again, the action starts at point 4, which is the
Execute cycle for this instruction, and the value loaded before shows up on the data bus.
Since the data bus is at high impedance at this clock cycle, the KDLX is able to drive
the bus and output data. Without high impedance, the KDLX is not able to use the bus
because it assumes another device is using it. By checking the address bus in the KDLX
simulation, one can see how the instruction and address correspond with each other.

The two instructions following the store instructions are 413408₁₆ and 415601₁₆. These
add immediate values to register 3 and 5, respectively, thus the data inside register 3
and 5 changes. This can be seen at point 5 when these two register contents are stored
again.

For the rest of this thesis, assembly language mnemonics are used to refer to
instructions. For example, a register is represented by R; thus R0 stands for register 0
and R1 means register 1. Instead of a long explanation of each instruction, the operation
symbol is also used in what follows. An instruction like 440305₁₆ is represented as
LW R3 ← Mem(R0+05). The symbols and expressions are defined in Appendix B.
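
The relationship between a 24-bit instruction word and its mnemonic form can be sketched as follows. The field layout used here (an 8-bit opcode, two 4-bit register fields and an 8-bit immediate) is an assumption inferred only from the load and store examples in this section; the authoritative encoding is the one in Appendix B. The mnemonic SW, the output format and the function names are likewise hypothetical.

```python
# Assumed opcodes, taken from the two examples in the text (0x44 load, 0x45 store).
MNEMONICS = {0x44: "LW", 0x45: "SW"}


def split_fields(word: int) -> dict:
    """Split a 24-bit word into the assumed opcode, register and immediate fields."""
    if not 0 <= word < (1 << 24):
        raise ValueError("instruction word must fit in 24 bits")
    return {
        "opcode": (word >> 16) & 0xFF,
        "base": (word >> 12) & 0xF,  # assumed: register holding the base address
        "reg": (word >> 8) & 0xF,    # assumed: register loaded or stored
        "imm": word & 0xFF,
    }


def describe(word: int) -> str:
    """Render a load or store word in a mnemonic form similar to the text."""
    f = split_fields(word)
    name = MNEMONICS.get(f["opcode"])
    location = f"Mem(R{f['base']}+{f['imm']:02X})"
    if name == "LW":
        return f"LW R{f['reg']} <- {location}"
    if name == "SW":
        return f"SW {location} <- R{f['reg']}"
    return f"opcode {f['opcode']:02X} (not modeled in this sketch)"


print(describe(0x440305))
print(describe(0x450507))
```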

E.  KDLX SIMULATION WITH MEMORY

There are a total of 42 instructions for the KDLX. Understanding these instructions is
necessary to generate a test bench for the TMR processor. Using different combinations of
instructions can also help a designer achieve the same simulation goal with a shorter
test bench. Instead of loading a large number of instructions into instruction memory
before testing, a pre-configured memory is used. Simply by selecting a different memory
file, the same test bench can be used to test a different instruction set; otherwise, a
separate test bench would be needed for each instruction set.
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Instead  of  testing  all  instructions  in  one  huge  test  bench,  the  42  instructions  were 
separated  into  four  different  instruction  sets.  Instruction  set  1  and  2  test  arithmetic  and 
logic  functions.  Instruction  set  3  and  4  test  Jump,  Branch  and  TRAP  functions. 

The schematic designed for this testing is shown in Figure 17. The memory on the left
side is a ROM used as instruction memory. The one on the right side is the data memory,
which is a RAM. The addr_box contains only buffers used to truncate the width of the
address bus, since the memory address for this design is only 8 bits wide. Data memory
is pre-configured with 0003₁₆ since some numbers need to be loaded into registers at the
beginning of simulation.

Figure 17. KDLX with Instruction and Data Memory

The write signal on KDLX is connected directly to data memory in order to be
able to write memory. Since KDLX uses a bi-directional data bus, buffers with an enable
pin are needed to control the direction of data flow. Read and write signals are used to
enable or disable these buffers. Extra output buses are added for monitoring purposes.
All test benches and simulation results are in Appendix A, section C.

1.  Implementation  Table  of  Instruction  Set  1 

The implementation table is given as Table 6. Constructing such an instruction
test bench can take a lot of time, since instructions need to be rearranged and
simulation results need to be checked. Not many instructions are tested in each set, but
a number of load and store instructions are needed to check the data. All numbers in
Table 6 are hexadecimal and R0 is always zero.


Instruction  Operation symbol      Opcode  Value through Data Bus
LW           R1←Mem(R0+03)         440103
SW           R1→Mem(R0+08)         450108  0003
LW           R2←Mem(R0+04)         440204
SW           R2→Mem(R0+09)         450209  0003
ADD          R1+R2→R3              011320
SW           R3→Mem(R0+0D)         45030D  0006
ADDI         R1+ext(F9)→R4         4114F9
SW           R4→Mem(R0+0E)         45040E  FFFC
ADDUI        R1+(0A)→R5            21150A
SW           R5→Mem(R0+0F)         45050F  000D
AND          R1*R3→R6              091630
SW           R6→Mem(R0+10)         450610  0002
ANDI         R4*(FD)→R7            2947FD
SW           R7→Mem(R0+11)         450711  00FC
LHI          R8←FF||(0)^8          0808FF
SW           R8→Mem(R0+12)         450812  FF00
OR           R1+R3→R9              0A1930
SW           R9→Mem(R0+13)         450913  0007
ORI          R1+(F0)→R10           2A1AF0
SW           R10→Mem(R0+14)        450A14  00F3
SEQ          R1=R2→R11=1           181B20
SW           R11→Mem(R0+15)        450B15  0001
SEQ          R1≠R3→R12=0           181C30
SW           R12→Mem(R0+16)        450C16  0000
SEQI         R1=(0003)→R13=1       581D03
SW           R13→Mem(R0+17)        450D17  0001
SEQI         R1≠(0004)→R14=0       581E04
SW           R14→Mem(R0+18)        450E18  0000
SLL          R4<<R2(0003)→R15      114F20
SW           R15→Mem(R0+19)        450F19  FFE0
SLLI         R4<<(0005)→R3         514305
SW           R3→Mem(R0+1A)         45031A  FF80
SRA          R4>>R1(0003)→R5       134510
SW           R5→Mem(R0+1B)         45051B  FFFF
SRLI         R4>>(0003)→R6         524603
SW           R6→Mem(R0+1C)         45061C  1FFF
SUBI         R8-ext(7B)→R7         43877B
SW           R7→Mem(R0+1D)         45071D  FE85
XOR          R9⊕R10→R11            0B9BA0
SW           R11→Mem(R0+1E)        450B1E  00F4

Table 6. Instruction Set 1

There are four sections in this map. Instructions for loading or computing data
are implemented first in each section. Instructions for storing are used for checking
data and are implemented later. The third column lists the Opcode of each instruction and
the fourth column shows the data that should come out on the data bus.
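
The Opcodes in Table 6 can be split into fields mechanically. The sketch below is only an illustration: the field layout (8-bit operation code, two 4-bit register numbers, then either an 8-bit immediate or a second source register in the upper nibble of the last byte) is inferred from the table entries, not taken from the KDLX specification, and the function name `decode` is hypothetical.

```python
def decode(word):
    """Split a 24-bit KDLX instruction word into fields.

    Assumption: the layout is inferred from Table 6, e.g. 4114F9 is
    ADDI R1+ext(F9)->R4 and 011320 is ADD R1+R2->R3.
    """
    return {
        "op":  (word >> 16) & 0xFF,  # 8-bit operation code
        "rs1": (word >> 12) & 0xF,   # first source (or base) register
        "rd":  (word >> 8) & 0xF,    # destination register (data register for SW)
        "rs2": (word >> 4) & 0xF,    # second source register (register-register forms)
        "imm": word & 0xFF,          # 8-bit immediate (immediate forms)
    }

for word in (0x4114F9, 0x011320, 0x450108):
    fields = decode(word)
    print(f"{word:06X}", {name: f"{value:X}" for name, value in fields.items()})
```

Only one of `rs2` and `imm` is meaningful for a given instruction; which one applies depends on the operation code.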

2. Simulation Result of Instruction Set 1

To  see  the  difference  with  the  simulation  of  KDLX  only,  part  of  the  simulation 
results  is  shown  in  Figure  18. 


Figure  18.  Simulation  of  KDLX  with  Memory 

In order to make sure that the memory is stable before KDLX uses it,
the memory clock cycle is doubled. The instruction memory will be ready before KDLX
reads the instruction. The data memory will write data in a very short time and always be
ready to be read by the KDLX.

Comparing the timing before and after KDLX is connected to the memory, a delay in
the read and write operations can be found. In Figure 18, the instruction at point 1 does
not start the write until point 2. Without the memory, this signal would be about one-half
clock cycle earlier than point 2. This difference is due to the timing delays of the
connected memory. The fourth cycle of the KDLX clock is Mem, which means that the
KDLX is accessing memory at this time.

Another delay shows up in instruction fetching. (Recall the schematic in Figure 17.)
The program counter of KDLX sends out an instruction address to the instruction
memory. Then the instruction memory reads the program counter and sends out an
instruction to KDLX. This delay makes each instruction in Figure 18 start at the falling
edge of the clock, unlike the instruction in Figure 16, which starts at the rising edge.
The same delay happens when KDLX reads from or writes to the data memory.

The  pipeline  feature  can  also  be  seen  in  Figure  18.  While  KDLX  is  still  sending 
out  data,  it  is  simultaneously  fetching  a  new  instruction. 

An alternative way to check the simulation result is to construct tables for
memories and registers as shown in Table 7. The instruction memory is pre-configured as
the first table at the left. The second table shows how the contents of registers change
in the simulation. The third table at the right expresses values in different locations
after the simulation is done. Blank areas in data memory will contain the default value
0003₁₆.

In the instruction memory, a series of store instructions is used to check the
contents of registers. A series of load instructions is used to check the contents of the
memory locations. The first six Opcodes implement the instructions in section 1 of
Table 6. Then the Opcodes from memory locations 08 to 10 execute the instructions in
section 2 of Table 6. All instructions for loading and computation are executed before
storing to memory. The instruction sequence in Table 6 is used to track which part of the
instructions is checked when storing.

Instruction Mem
Addr  Opcode    Addr  Opcode
00              2D    45071D
01    440103    2E    450B1E
02    440204    2F    000000
03    000000    30    000000
04    000000    31    000000
05    450108    32    450101
06    450209    33    450201
07    000000    34    450301
08    011320    35    450401
09    4114F9    36    450501
0A    21150A    37    450601
0B    000000    38    450701
0C    091630    39    450801
0D    45030D    3A    450901
0E    45040E    3B    450A01
0F    45050F    3C    450B01
10    450610    3D    450C01
11    2947FD    3E    450D01
12    0808FF    3F    450E01
13    0A1930    40    450F01
14    2A1AF0    41    000000
15    450711    42    000000
16    450812    43    000000
17    450913    44    44010D
18    450A14    45    44020E
19    181B20    46    44030F
1A    181C30    47    440410
1B    581D03    48    440511
1C    581E04    49    440612
1D    450B15    4A    440713
1E    450C16    4B    440814
1F    450D17    4C    440915
20    450E18    4D    440A16
21    114F20    4E    440B17
22    514305    4F    440C18
23    134510    50    440D19
24    524603    51    440E1A
25    450F19    52    440F1B
26    45031A    53    44011C
27    45051B    54    44021D
28    45061C    55    44031E
29    43877B    56    000000
2A    0B9BA0    57    000000
2B    000000    58    000000
2C    000000    59    000000

Register
Reg  Contents (in order of appearance)
00
01   0003
02   0003
03   0006  FF80
04   FFFC
05   000D  FFFF
06   0002  1FFF
07   00FC  FE85
08   FF00
09   0007
10   00F3
11   0001  00F4
12   0000
13   0001
14   0000
15   FFE0

Data Mem
Addr   Value
00-07
08     0003
09     0003
0A-0C
0D     0006
0E     FFFC
0F     000D
10     0002
11     00FC
12     FF00
13     0007
14     00F3
15     0001
16     0000
17     0001
18     0000
19     FFE0
1A     FF80
1B     FFFF
1C     1FFF
1D     FE85
1E     00F4
1F-2A

Table 7. Tables of Registers and Memories in Simulation 1

The Opcode 4114F9₁₆ at memory location 09₁₆ implements ADDI
R1+ext(F9)→R4. The original value of R1 is 0003₁₆, which equals 3₁₀. Since KDLX
uses 2's complement numbers, the sign-extended value of F9₁₆ is FFF9₁₆, which is -7 in
decimal. The sum of 3₁₀ and (-7)₁₀ is (-4)₁₀. Converting (-4)₁₀ to a 16-bit 2's
complement number gives FFFC₁₆ in hexadecimal. This agrees with the value in data
memory location 0E₁₆.
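
The arithmetic above can be reproduced in a few lines. This is a minimal sketch of the sign extension and 16-bit wrap-around only; the helper names are hypothetical and are not part of the KDLX VHDL.

```python
MASK16 = 0xFFFF

def sign_extend8(imm):
    """Extend an 8-bit immediate to a 16-bit two's-complement pattern."""
    return imm | 0xFF00 if imm & 0x80 else imm

def to_signed16(value):
    """Interpret a 16-bit pattern as a signed integer."""
    return value - 0x10000 if value & 0x8000 else value

r1 = 0x0003                   # value of R1 before the ADDI
ext = sign_extend8(0xF9)      # immediate field of Opcode 4114F9
r4 = (r1 + ext) & MASK16      # 16-bit result written to R4

print(f"ext(F9) = {ext:04X} ({to_signed16(ext)})")
print(f"R4      = {r4:04X} ({to_signed16(r4)})")
```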

3.  Implementation  Table  of  Instruction  Set  2 

The rest of the instructions (not including Jump and Branch) are listed in Table 8.
This table only shows the instructions that were tested in this thesis. The table does not
include the instructions for configuring memory contents. This will be explained further
in the simulation section of this chapter.


Instruction  Operation symbol      Opcode  Expected Value
SGE          R1≥R3→R13=1           191D30
SW           R13→Mem(R0+1F)        450D1F  0001
SGE          R15≥R14→R9=0          19F9E0
SW           R9→Mem(R0+20)         450920  0000
SGEI         R15≥ext(E8)→R10=0     59FAE8
SW           R10→Mem(R0+21)        450A21  0000
SGEI         R15≥ext(E0)→R11=1     59FBE0
SW           R11→Mem(R0+22)        450B22  0001
SGT          R4>R15→R6=1           1A46F0
SW           R6→Mem(R0+23)         450623  0001
SGT          R15>R4→R7=0           1AF740
SW           R7→Mem(R0+24)         450724  0000
SGTI         R15>ext(FF)→R8=0      5AF8FF
SW           R8→Mem(R0+25)         450825  0000
SGTI         R15>ext(87)→R9=1      5AF987
SW           R9→Mem(R0+26)         450926  0001
SLE          R1≤R2→R10=1           1B1A20
SW           R10→Mem(R0+27)        450A27  0001
SLE          R1≤R13→R11=0          1B1BD0
SW           R11→Mem(R0+28)        450B28  0000
SLEI         R1≤ext(03)→R12=1      5B1C03
SW           R12→Mem(R0+29)        450C29  0001
SLEI         R1≤ext(02)→R13=0      5B1D02
SW           R13→Mem(R0+2A)        450D2A  0000
SLT          R15<R1→R6=1           1CF610
SW           R6→Mem(R0+01)         450601  0001
SLT          R1<R15→R7=0           1C17F0
SW           R7→Mem(R0+02)         450702  0000
SLTI         R1<ext(0D)→R8=1       5C180D
SW           R8→Mem(R0+03)         450803  0001
SLTI         R1<ext(01)→R9=0       5C1901
SW           R9→Mem(R0+04)         450904  0000
SNE          R1≠R2→R10=0           1D1A20
SW           R10→Mem(R0+05)        450A05  0000
SNE          R1≠R15→R11=1          1D1BF0
SW           R11→Mem(R0+06)        450B06  0001
SNEI         R1≠ext(03)→R12=1      581C03
SW           R12→Mem(R0+07)        450C07  0001
SNEI         R15≠ext(E1)→R13=0     58FDE1
SW           R13→Mem(R0+08)        450D08  0000
SRAI         R3>>(0006)→R6         533606
SW           R6→Mem(R0+09)         450609  FFFE
SRL          R3>>R2(0003)→R7       123720
SW           R7→Mem(R0+0A)         45070A  1FF0
XORI         R15⊕(8A)→R8           2BF88A
SW           R8→Mem(R0+0B)         45080B  FF6A
SUBUI        R3-(80)→R9            233980
SW           R9→Mem(R0+0C)         45090C  FF00
SUB          R1-R3→R14             031E30
SW           R14→Mem(R0+0D)        450E0D  0083

Table 8. Instruction Set 2

4.  Simulation  Result  of  Instruction  Set  2 

The complete table set that shows all values inside memories and registers for this
simulation is shown in Table 9. In the instruction memory part of the table, the
instructions shown in Table 8 actually start at memory location 2A₁₆. Instructions before
this point are used to generate the same register values used in instruction set 1. The
first column of Table 9 shows values that are identical to the final results in Table 7.

The  registers  change  many  times  during  this  simulation,  but  the  table  only  shows 
the  initial  and  final  values.  The  first  column  as  described  in  the  last  paragraph  is  the 
starting  data  for  instruction  set  2.  The  second  column  lists  all  final  values  in  registers. 

This simulation uses different data memory locations than instruction set 1. This
provides a boundary test for memory while testing KDLX.

This instruction set demonstrates most of the possible comparisons between
registers or of a register with an immediate value. Since the KDLX uses 2's complement
values, 0003₁₆ (3₁₀) is greater than FF80₁₆ (-128₁₀). Logical operations like ANDI, ORI,
and XORI do not use sign extension on an immediate value.
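
The two points made above, signed comparison and zero-extended logical immediates, can be checked against values from Tables 6 and 8. The following is a behavioral sketch with hypothetical helper names; it models only the arithmetic, not the KDLX data path.

```python
def to_signed16(value):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    return value - 0x10000 if value & 0x8000 else value

def sign_extend8(imm):
    """Extend an 8-bit immediate to 16 bits, as the comparison instructions do."""
    return imm | 0xFF00 if imm & 0x80 else imm

def sgt(a, b):
    """Set-greater-than on signed 16-bit register values."""
    return int(to_signed16(a) > to_signed16(b))

def sgti(a, imm8):
    """Set-greater-than against a sign-extended immediate."""
    return sgt(a, sign_extend8(imm8))

def andi(a, imm8):
    """Logical AND with a zero-extended immediate (no sign extension)."""
    return a & imm8

def xori(a, imm8):
    """Logical XOR with a zero-extended immediate (no sign extension)."""
    return a ^ imm8

print(sgt(0x0003, 0xFF80))             # 3 compared with -128
print(sgti(0xFFE0, 0x87))              # R15 compared with ext(87)
print(f"{andi(0xFFFC, 0xFD):04X}")     # ANDI R4*(FD) from Table 6
print(f"{xori(0xFFE0, 0x8A):04X}")     # XORI R15 xor (8A) from Table 8
```

If the logical immediates were sign-extended instead, the upper byte of the ANDI result would follow R4 rather than being cleared, which is not what Table 6 shows.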

Instruction Mem
Addr  Opcode    Addr  Opcode
00              30    450A21
01    410103    31    450B22
02    410203    32    1A46F0
03    0803FF    33    1AF740
04    0804FF    34    5AF8FF
05    0805FF    35    5AF987
06    08061F    36    450623
07    410380    37    450724
08    4104FC    38    450825
09    4105FF    39    450926
0A    2166FF    3A    1B1A20
0B    0807FE    3B    1B1BD0
0C    0808FF    3C    5B1C03
0D    080FFF    3D    5B1D02
0E    210AF3    3E    450A27
0F    217785    3F    450B28
10    210BF4    40    450C29
11    410907    41    450D2A
12    410D01    42    1CF610
13    410E00    43    1C17F0
14    410C00    44    5C180D
15    410FE0    45    5C1901
16    000000    46    450601
17    000000    47    450702
18    450100    48    450803
19    450200    49    450904
1A    450300    4A    1D1A20
1B    450400    4B    1D1BF0
1C    450500    4C    581C03
1D    450600    4D    58FDE1
1E    450700    4E    450A05
1F    450800    4F    450B06
20    450900    50    450C07
21    450A00    51    450D08
22    450B00    52    533603
23    450C00    53    123720
24    450D00    54    2BF88A
25    450E00    55    233980
26    450F00    56    031E30
27    000000    57    450609
28    000000    58    45070A
29    000000    59    45080B
2A    191D30    5A    45090C
2B    19F9E0    5B    450E0D
2C    59FAE8    5C    000000
2D    59FBE0    5D    000000
2E    450D1F    5E    000000
2F    450920    5F    000000

Register
Reg  Initial  Final
00
01   0003     0003
02   0003     0003
03   FF80     FF80
04   FFFC     FFFC
05   FFFF     FFFF
06   1FFF     FFFE
07   FE85     1FF0
08   FF00     FF6A
09   0007     FF00
10   00F3     0000
11   00F4     0001
12   0000     0001
13   0001     0000
14   0000     0083
15   FFE0     FFE0

Data Mem
Addr   Value
00
01     0001
02     0000
03     0001
04     0000
05     0000
06     0001
07     0001
08     0000
09     FFFE
0A     1FF0
0B     FF6A
0C     FF00
0D     0083
0E-1E
1F     0001
20     0000
21     0000
22     0001
23     0001
24     0000
25     0000
26     0001
27     0001
28     0000
29     0001
2A     0000

Table 9. Tables of Registers and Memories in Simulation 2

5. Implementation Table of Instruction Set 3

This instruction set starts by testing the Jump and Branch instructions. The
complete implementation is listed in Table 10. There are no divisions in this table and
the sequence of execution is from top to bottom. If an instruction jumps to the wrong
memory location, one or all contents of the target registers will not agree with the
expected value shown here.


Instruction  Operation symbol                     Opcode  Expected Value
LW           R1←Mem(R0+03)                        410103
LW           R2←Mem(R0+04)                        410204
LW           R3←Mem(R0+00)                        410300
LW           R4←Mem(R0+06)                        410406
BNEZ         R1≠0→Prog_Addr←(05)+1+ext(04)        C01004
             Note: PC=05 and (05)+1+ext(04)=0A
BEQZ         R3=0→Prog_Addr←(0A)+1+ext(04)        C13004
             Note: PC=0A and (0A)+1+ext(04)=0F
ADDI         R0+ext(25)→R5                        410525
J            (0020)→Prog_Addr                     C80020
JAL          (0014)→Prog_Addr; (23)→R15           E80014
             Note: (23) is return address
ADDI         R0+ext(8A)→R6                        41068A
ADDI         R0+ext(40)→R7                        410740
ADD          R1+R2→R8                             011820
ADD          R1+R4→R9                             011940
SW           R15→Mem(R0+01)                       450F01  0023
JALR         R5→Prog_Addr; (1D)→R15               685000
             Note: (1D) is return address
J            (0030)→Prog_Addr                     C80030
SW           R5→Mem(R0+02)                        450502  0025
SW           R6→Mem(R0+03)                        450603  FF8A
SW           R7→Mem(R0+04)                        450704  0040
SW           R8→Mem(R0+05)                        450805  0007
SW           R9→Mem(R0+06)                        450906  0009
SW           R15→Mem(R0+07)                       450F07  001D
JR           R7→Prog_Addr                         487000
SW           R2→Mem(R0+08)                        450208  0004

Table 10. Instruction Set 3

6. Simulation Result of Instruction Set 3

For Jump and Branch instructions, the sequence of instructions in memory is not
the sequence of implementation. This can be easily understood by looking at Table 11.

The black arrows represent the normal sequence of operation. The blue dashed lines
stand for Jump or Branch instructions without link, and the blue solid lines stand for
Jump and Link or Branch and Link.

The first branch occurs at memory location 05₁₆. Since the program counter at
that point is 05₁₆, it branches to memory location 0A₁₆ with a given immediate value 04₁₆.
The action of branching occurs two clocks later due to pipelining, so the instructions at
memory locations 06₁₆ and 07₁₆ are fetched before the sequence branches to the new
address.

At memory location 0A₁₆, another branch instruction is executed. It branches to
another memory location, 0F₁₆. Because the Opcode 410525₁₆ is fetched before the
branch occurs, an immediate value is added into R5. This can be checked in the register
table or in data memory location 02₁₆, where Opcode 450502₁₆ stores the data.

Opcode E80014₁₆ is a Jump and Link instruction. It jumps to address 14₁₆ and
saves address 23₁₆ into R15. The jump effectively occurs at address 23₁₆, not at address
20₁₆, 21₁₆ or 22₁₆, because the two instructions following Jump and Link are fetched
before the jump instruction is executed.

The instruction at memory location 1A₁₆ is Jump Register and Link. This allows
KDLX to read the address it wishes to jump to directly from its internal register.
Suppose one register is reserved for a special purpose and it contains a special memory
location. Then KDLX can always jump to that specific memory location by simply reading
the contents of that register, without any extra instructions needing to be implemented.
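
The address arithmetic used in this section can be summarized in a short sketch. It assumes, as the notes in Table 10 indicate, that a branch target is PC+1+ext(immediate) and that the link address is the address following the two instructions fetched after the jump. The function names and the 8-bit address mask are illustrative assumptions.

```python
DELAY_SLOTS = 2    # instructions fetched after a jump or branch before it takes effect
ADDR_MASK = 0xFF   # assumption: 8-bit memory address, as used in this design

def sign_extend8(imm):
    """Extend an 8-bit immediate to a 16-bit two's-complement pattern."""
    return imm | 0xFF00 if imm & 0x80 else imm

def branch_target(pc, imm8):
    """Target of BNEZ/BEQZ: PC + 1 + ext(immediate), per the notes in Table 10."""
    return (pc + 1 + sign_extend8(imm8)) & ADDR_MASK

def return_address(pc):
    """Link address: first address after the two delay-slot instructions."""
    return (pc + 1 + DELAY_SLOTS) & ADDR_MASK

print(f"BNEZ at 05 -> {branch_target(0x05, 0x04):02X}")
print(f"BEQZ at 0A -> {branch_target(0x0A, 0x04):02X}")
print(f"JAL  at 20 saves {return_address(0x20):02X}")
print(f"JALR at 1A saves {return_address(0x1A):02X}")
```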

Data Mem
Addr   Value
00
01     0023
02     0025
03     FF8A
04     0040
05     0007
06     0009
07     001D
08     0004
09-2A

Register
Reg  Value
00
01   0003
02   0004
03   0000
04   0006
05   0025
06   FF8A
07   0040
08   0007
09   0009
10-15

Instruction Mem
Addr   Opcode  Jumps to
00
01     410103
02     410204
03     410300
04     410406
05     C01004  0A
06     000000
07     000000
08-09
0A     C13004  0F
0B     410525
0C     000000
0D-0E
0F     C80020  20
10     000000
11     000000
12-13
14     011820
15     011940
16     450F01
17     000000
18     000000
19     000000
1A     685000  25 (link)
1B     000000
1C     000000
1D-1F
20     E80014  14 (link)
21     41068A
22     410740
23-24
25     C80030  30
26     000000
27     000000
28-2A
30     450502
31     450603
32     450704
33     450805
34     450906
35     450F07
36     487000  40
37     000000
38     000000
39
40     450208
41     000000
42     000000
43     000000

Table 11. Tables of Registers and Memories in Simulation 3


7. Implementation Table of Instruction Set 4

This instruction set contains one of the most complicated instructions in the TMR
design, which is the TRAP instruction. The TRAP instruction acts like Jump and Link or
Branch and Link. The difference is that it saves its return address into the IAR, not into
R15. The IAR is a specific register mentioned earlier when introducing the PC control
inside KDLX. Storing the return address into the IAR not only saves a register but also
guarantees its integrity, since the IAR is only accessible to the TRAP instruction.

Another feature of the TRAP instruction is that it has a companion instruction called
Return from Exception (RFE). The RFE, Opcode F80000₁₆, only reads the content of the IAR
and jumps to that address. Since the IAR always contains the return address of the TRAP
instruction, the RFE instruction only works with the TRAP instruction.

Instruction set 4 for testing the TRAP instruction is shown in Table 12.


Instruction  Operation symbol                Opcode  Expected Value
ADDI         R0+ext(04)→R1                   410104
ADDI         R0+ext(07)→R2                   410207
TRAP         (0020)→Prog_Addr; (06)→IAR      280020
             Note: (06) is return address
ADDI         R0+ext(09)→R3                   410309
ADDI         R0+ext(15)→R4                   410415
ADDI         R0+ext(0A)→R7                   41070A
ADDI         R0+ext(11)→R8                   410811
ADDI         R0+ext(C2)→R10                  410AC2
RFE          (06)→Prog_Addr                  F80000
             Note: (06) is IAR
J            (0011)→Prog_Addr                C80011
SW           R1→Mem(R0+01)                   450101  0004
SW           R2→Mem(R0+02)                   450202  0007
SW           R3→Mem(R0+03)                   450303  0009
SW           R4→Mem(R0+04)                   450404  0015
SW           R7→Mem(R0+07)                   450707  000A
SW           R8→Mem(R0+08)                   450808  0011
SW           R10→Mem(R0+0A)                  450A0A  FFC2

Table 12. Instruction Set 4

8. Simulation Result of Instruction Set 4

The features of the TRAP instruction are shown in Table 13. When fetching the
TRAP instruction at memory location 03₁₆, KDLX stores the return address 06₁₆ to the
IAR. Two clock cycles after the TRAP, the program counter changes to 20₁₆ and reads
the instruction at that address. After implementing a few instructions, the KDLX sees the
Opcode F80000₁₆ and retrieves address 06₁₆ for the return. The content at location 06₁₆
is a Jump instruction. Therefore, the KDLX jumps again, to memory location 11₁₆.

Some important features can be found in this implementation. First, the TRAP
occurs exactly after 2 clock cycles; otherwise the Opcode C80011₁₆ would be fetched.
Second, the IAR is not directly addressable, so using Opcode F80000₁₆ is the only way to
verify the content of the IAR. Third, instruction set 4 can be an infinite loop if the
test bench never stops. After jumping to memory location 11₁₆, the program counter keeps
counting in order to read instructions. If no other signal stops the KDLX, it will read
Opcode F80000₁₆ again. This retrieves the IAR and jumps back to memory location 06₁₆.
The Opcode C80011₁₆ then leads KDLX to jump to address 11₁₆ and keep on reading
instructions until it hits F80000₁₆ again. This loop can be observed in the full
simulation result for instruction set 4 in Appendix A, section C.
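
The fetch order described above, including the loop, can be reproduced with a small behavioral model. This is a sketch under stated assumptions: only TRAP, RFE and J are modeled (all other words are treated as no-ops), a jump takes effect after two further fetches, the target is taken from the low 16 bits of the Opcode, and unlisted addresses read as 000000. The names `trace` and `PROGRAM` are hypothetical; the memory contents come from Table 13.

```python
# Operation codes as they appear in Table 13.
TRAP, RFE, J = 0x28, 0xF8, 0xC8
DELAY_SLOTS = 2  # instructions still fetched after a jump is fetched

# Non-zero instruction memory words from Table 13.
PROGRAM = {
    0x01: 0x410104, 0x02: 0x410207, 0x03: 0x280020,
    0x04: 0x410309, 0x05: 0x410415, 0x06: 0xC80011,
    0x11: 0x450101, 0x12: 0x450202, 0x13: 0x450303, 0x14: 0x450404,
    0x15: 0x450707, 0x16: 0x450808, 0x17: 0x450A0A,
    0x20: 0x41070A, 0x21: 0x410811, 0x22: 0x410AC2, 0x26: 0xF80000,
}

def trace(program, start=0x01, steps=45):
    """Return the sequence of fetch addresses for `steps` fetches."""
    pc, iar, pending = start, None, None
    fetched = []
    for _ in range(steps):
        fetched.append(pc)
        word = program.get(pc, 0)
        op, target = word >> 16, word & 0xFFFF
        if pending is not None:
            slots_left, dest = pending
            if slots_left == 1:          # last delay slot: redirect the next fetch
                pc, pending = dest, None
                continue
            pending = (slots_left - 1, dest)
        elif op == TRAP:
            iar = pc + 1 + DELAY_SLOTS   # return address saved in the IAR
            pending = (DELAY_SLOTS, target)
        elif op == J:
            pending = (DELAY_SLOTS, target)
        elif op == RFE:
            pending = (DELAY_SLOTS, iar)  # jump to the address held in the IAR
        pc += 1
    return fetched

print(" ".join(f"{addr:02X}" for addr in trace(PROGRAM)))
```

In the printed trace, address 06 appears once after the first RFE and again after the second, which is the loop noted in the text.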

Instruction Mem
Addr   Opcode  Jumps to
00
01     410104
02     410207
03     280020  20 (return address 06 saved in IAR)
04     410309
05     410415
06     C80011  11
07     000000
08     000000
09-10
11     450101
12     450202
13     450303
14     450404
15     450707
16     450808
17     450A0A
18     000000
19     000000
1A     000000
1B-1F
20     41070A
21     410811
22     410AC2
23     000000
24     000000
25     000000
26     F80000  06 (address read from IAR)
27     000000
28     000000
29-2A

Register
Reg  Value
00
01   0004
02   0007
03   0009
04   0015
05-06
07   000A
08   0011
09
10   FFC2
11-15

Table 13. Tables of Registers and Memories in Simulation 4

F. CHAPTER SUMMARY

This chapter introduced several important components inside KDLX and
discussed pipeline concepts. Drawing a schematic from VHDL code is a good way to
understand KDLX.

The simulation of KDLX with and without memory illustrated the concept of the
pipeline and developed ideas on how to organize a test bench. Most of the tables
necessary for simulation purposes were generated in this chapter. Having the tables
generated before constructing a test bench helps a designer to understand what the goal
is and how to achieve it. Tables created from the simulation give a designer a big
picture of how things interact with each other. Sometimes things are hard to say but easy
to see.

The TMR Assembly is designed in the next chapter. The function of the voter
and how it corrects an error will be explained. Then we will combine three KDLX
processors with voters to form a TMR Assembly. Important simulation concepts will be
reviewed as well.

V. TMR ASSEMBLY


The  TMR  Assembly  is  composed  of  three  KDLX  processors  with  voters  on  all 
outputs.  All  of  the  KDLX  instructions  have  been  tested  in  the  simulation  described  in  the 
previous  chapter  and  the  fundamental  concept  of  KDLX  has  been  established.  The  next 
step  is  to  realize  the  function  of  a  voter. 

A voter is constructed from some simple logic gates and is able to find an error
when its inputs are not consistent. Since the CFTP will be operating in a relatively
benign LEO orbit, the TMR design does not have to deal with too many errors per unit
time. The assumption of the TMR design is that we will not see identical errors on two
processors at the same time. The voters pass the majority vote so, if the errors are
identical, they will not be detected (and will, in fact, be passed on as the correct
value).

A. 1-BIT VOTER

The CFTP is designed to be fault tolerant by software. Its circuit needs to be able
to detect an error and correct the error by itself. In order to achieve that, the concept
of a voter is introduced.

The function of a 1-bit voter has been introduced in Lashomb's thesis [1]. This
section reviews the basic concepts and then starts constructing the TMR Assembly.
Figure 19 shows what a 1-bit voter looks like. It is a simple circuit consisting of only
AND and OR gates.


Figure 19. 1-Bit Majority Voter (After Ref. [1].)

The voter function is more obvious in the truth table shown in Table 14. This
voter always selects the majority of identical bits as its output bit. If two or more
inputs are incorrect, the voter output will also be incorrect. The ability to detect and
correct two or more errors in a voter is not vital for a system (e.g., the CFTP) in LEO
orbit.


A  B  C  Y
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  1
1  0  0  0
1  0  1  1
1  1  0  1
1  1  1  1

Table 14. Truth Table of 1-Bit Voter (From Ref. [1].)
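
The majority function can be written directly from the truth table. The sketch below uses the common sum-of-products form with three two-input AND terms and one three-input OR; that gate arrangement is an assumption consistent with the description of Figure 19, not a transcription of the schematic.

```python
from itertools import product

def majority(a, b, c):
    """1-bit majority vote built only from AND and OR terms."""
    return (a & b) | (b & c) | (a & c)

# Reproduce the rows of Table 14 in the same order.
print("A B C Y")
for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, majority(a, b, c))
```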


Assuming  a  single  error,  the  output  is  always  correct,  but  we  cannot  tell  if  there 
has  been  an  error  just  by  looking  at  this  output.  Therefore,  some  extra  gates  are  added  to 
report  the  occurrence  of  an  error.  Figure  20  shows  a  voter  with  error  detection  and  Table 
15  is  its  truth  table. 

Figure 20. Voter with Error Detection (After Ref. [1].)

A  B  C  Y  ERR
0  0  0  0  0
0  0  1  0  1
0  1  0  0  1
0  1  1  1  1
1  0  0  0  1
1  0  1  1  1
1  1  0  1  1
1  1  1  1  0

Table 15. Truth Table of Voter with Error Detection (From Ref. [1].)
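
A behavioral sketch of the voter with error detection follows. The ERR expression used here (OR of two XOR comparisons) is one possible way to realize the ERR column of Table 15; the actual gates in Figure 20 may differ, and the function name is hypothetical.

```python
from itertools import product

def vote_with_error(a, b, c):
    """Return (Y, ERR): the majority bit and a flag raised when the inputs disagree."""
    y = (a & b) | (b & c) | (a & c)
    err = (a ^ b) | (b ^ c)   # 0 only when all three inputs are identical
    return y, err

print("A B C Y ERR")
for a, b, c in product((0, 1), repeat=3):
    y, err = vote_with_error(a, b, c)
    print(a, b, c, y, err)
```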

The error detection signal, ERR, is 1 when one of the inputs is not identical to the
rest. When the CFTP is in space, it is possible to have an SEU on the voter itself. A bit
flip may cause the voter output to be incorrect. Suppose the inputs are those of the
second row of Table 15 (A=0, B=0, C=1) and a bit flip occurs on A. This flip makes 1
the majority bit, and output Y will give a 1, not a 0. Since a voter is used to catch and
correct an error, it is not acceptable for it to have an error itself. Thus, some
reliability is needed for the voter. A voter with added reliability is shown in Figure 21.


Figure 21. Voter with Added Reliability (After Ref. [1].)

This version is built by duplicating the original part of the voter and XORing the
two parts to generate a voter error detection signal, V_ERR. If the voter errs, the
outputs of the two OR3 gates in Figure 21 will not agree with each other, and V_ERR
becomes 1. Table 16 is the truth table of this circuit.


A  B  C  Y  V_ERR
0  0  0  0  0
0  0  1  0  0
0  1  0  0  0
0  1  1  1  0
1  0  0  0  0
1  0  1  1  0
1  1  0  1  0
1  1  1  1  0

Table 16. Truth Table of Voter with Added Reliability (From Ref. [1].)
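
The duplicate-and-compare idea can be modeled as two copies of the majority logic whose outputs are XORed. In the sketch below, the `flip_copy` parameter is a hypothetical way of injecting an upset into one copy, and the choice of copy 0 as the source of Y is an assumption, since the text does not say which OR3 output drives Y.

```python
def majority(a, b, c):
    """1-bit majority vote."""
    return (a & b) | (b & c) | (a & c)

def reliable_voter(a, b, c, flip_copy=None):
    """Return (Y, V_ERR) for a voter made of two identical majority circuits.

    flip_copy: None for a fault-free voter, or 0/1 to invert the output of
    that copy (a hypothetical single upset inside the voter).
    """
    copies = [majority(a, b, c), majority(a, b, c)]
    if flip_copy is not None:
        copies[flip_copy] ^= 1
    y = copies[0]                     # assumption: copy 0 drives Y
    v_err = copies[0] ^ copies[1]     # 1 when the two copies disagree
    return y, v_err

print(reliable_voter(0, 1, 1))               # fault-free voter
print(reliable_voter(0, 1, 1, flip_copy=1))  # upset in the second copy
```

With no injected fault, V_ERR is 0 for every input combination, which matches Table 16.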


The last step is to collect all of these pieces to construct a complete single-bit
voter. As introduced earlier, a voter with error detection is able to correct the error
and tell the user an error has occurred. For the TMR design, knowing the existence of an
error is not good enough, since the error also has to be corrected. In order to correct
the error, the faulty input may need to be identified. With all these considerations, a
complete circuit is generated as shown in Figure 22. The truth table for this circuit is
Table 17.


Figure 22. Complete Majority Voter (After Ref. [1].)

New signals CID_0 and CID_1 are used to identify the faulty input, with CID_0 representing the least significant bit. Using the third row of the table as an example, the voter should be able to capture the error and identify the faulty input pin. The output signal Y is a 0 and D_ERR, the error detection signal, reports a 1. This indicates that one of the input signals is not consistent and that the correct input value is 0. Furthermore, CID_1 and CID_0 show 1 and 0, respectively, which means the second processor is faulty. Since Y is 0 and the second input is faulty, it can be concluded that input B has an error and its value is 1.
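The behavior described for the third row can be sketched as follows. The codes 01 for input A and 10 for input B follow the text; using 11 for input C and 00 for "no fault" is an assumption that is consistent with the later simulation analysis, and the function name is hypothetical. V_ERR is left out because it only reports faults inside the voter.

```python
def complete_voter(a: int, b: int, c: int) -> dict:
    """Behavioral model of the single-bit voter outputs Y, D_ERR, CID_1, CID_0."""
    y = (a & b) | (b & c) | (a & c)        # majority value
    d_err = int(not (a == b == c))         # 1 when the inputs are not unanimous
    if not d_err:
        cid = 0b00                         # assumed code for "no faulty input"
    elif a != y:
        cid = 0b01                         # first input (A) disagrees
    elif b != y:
        cid = 0b10                         # second input (B) disagrees
    else:
        cid = 0b11                         # third input (C) disagrees (assumed code)
    return {"Y": y, "D_ERR": d_err, "CID_1": cid >> 1, "CID_0": cid & 1}


if __name__ == "__main__":
    # Third row of the truth table: A=0, B=1, C=0
    print(complete_voter(0, 1, 0))
```

Because only one of three binary inputs can disagree with the majority, a single 2-bit code is enough to name the faulty input.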

The schematic of the complete majority voter built in ISE is shown in Figure 23. All input and output pins are 1-bit wide.


[Symbol pins: inputs A, B, C; outputs Y, V_ERR, CID_0, CID_1, D_ERR]

Figure 23. Schematic Symbol of 1-Bit Majority Voter

B. 16-BIT VOTER

Since KDLX has 16-bit output buses, 16-bit voters are needed in order to vote every bit on these buses. A 16-bit voter is simply composed of sixteen 1-bit voters as shown in Figure 24. All voters vote in parallel and produce five output buses for five different signals: Y, V_ERR, CID_0, CID_1, and D_ERR.

Sixteen 1-Bit Voters


Figure 25 is the schematic symbol used in ISE. The signal name D_ERR is changed to ERR in order to simplify the notation.


[Symbol pins: inputs A(15:0), B(15:0), C(15:0); outputs Y(15:0), V_ERR(15:0), CID_0(15:0), CID_1(15:0), ERR(15:0)]

Figure 25. Schematic Symbol of 16-Bit Voter
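Sixteen 1-bit voters working in parallel behave like bitwise operations on 16-bit words. The sketch below models that; the name `vote16` is hypothetical, the CID encoding (01 for A, 10 for B, 11 for C) is assumed as above, and V_ERR is omitted because no voter fault is modeled.

```python
MASK = 0xFFFF  # 16-bit buses


def vote16(a: int, b: int, c: int):
    """Return (Y, ERR, CID_1, CID_0) as 16-bit words, one bit per 1-bit voter."""
    y = ((a & b) | (b & c) | (a & c)) & MASK   # bitwise majority
    bad_a = (a ^ y) & MASK                     # bit positions where A is outvoted
    bad_b = (b ^ y) & MASK                     # bit positions where B is outvoted
    bad_c = (c ^ y) & MASK                     # bit positions where C is outvoted
    err = bad_a | bad_b | bad_c
    cid_1 = bad_b | bad_c                      # high code bit set for B (10) and C (11)
    cid_0 = bad_a | bad_c                      # low code bit set for A (01) and C (11)
    return y, err, cid_1, cid_0


if __name__ == "__main__":
    # Processor B differs from A and C in the lowest bit only.
    y, err, cid_1, cid_0 = vote16(0x00F0, 0x00F1, 0x00F0)
    print(f"Y={y:04X} ERR={err:04X} CID_1={cid_1:04X} CID_0={cid_0:04X}")
```

Each bit position is voted independently, so different processors can be flagged at different bit positions of the same word.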

The voter performs an important role in TMR. It is the device that catches and reports errors. The CFTP in space can have an SEU occur anywhere in the FPGA. If the error is caught by the voter, it will be corrected. If the voter votes incorrectly, the error will be caught by the voter error detection circuitry. The problem becomes more complicated if an error occurs in the voter error detection itself. If the voter votes wrong and the error detection does not catch it, the error may propagate through the system and corrupt the data. A new circuit can be added to check the error detection, but adding gates increases both the probability of an error and the complexity. Making a voter that has acceptable reliability without increasing the probability of an SEU too much is difficult.

C.  TMR  ASSEMBLY  WITHOUT  MEMORY 

The concept of the TMR is to triplicate processors and vote all output signals to get correct values. An even number of processors cannot use majority voters because a vote can end in a tie. Five or more processors will increase the circuit size dramatically. As described earlier, this increases the probability of having an error caused by an SEU. The usual compromise is to use three processors. The TMR does not increase circuitry too much, and its effectiveness has been proven in some existing space systems.

In this section, several different architectures will be discussed, which is a good chance to show how things change when different components are used. Important learning points will be provided at the end of this chapter.

1. Schematic and Simulation 1


Figure 26 is the first design of the TMR Assembly for this thesis. Important signals are indicated with arrows. The three big blocks at the left side are KDLX processors. The sequence from top to bottom is processor A, B and C. The 24-bit instruction input buses are instr_a(23:0), instr_b(23:0), and instr_c(23:0), respectively.

Voters  are  connected  at  the  outputs  of  the  processors.  All  of  the  outputs  are 
voted.  The  first  three  voters  at  the  top  are  1-bit  voters  for  control  signals  and  the  other 
three  are  16-bit  voters  for  buses.  The  voter  at  the  top  is  the  voter  for  the  program  read 
signal.  The  read  signals  for  the  instruction  fetch  of  all  three  processors  are  connected  to 
this  voter  to  be  voted.  The  second  one  is  the  voter  for  data  read  signals  and  the  third  one 
is  for  data  write  signals.  The  three  16-bit  voters  are  for  the  address,  the  program  counter, 
and  the  data  bus,  respectively. 

The status outputs of the voters are collected into buses, one bus per signal type. Therefore, there are four buses on the right side. One data bus is at the output of the data voter, named data_p(15:0). Since each bus on the right side collects the outputs of six voters (three 1-bit voters and three 16-bit voters), each bus is 51 bits wide.

Because the data memory used in the ISE has separate buses for the input and the output, data_p(15:0) is generated as a write bus and data_m(15:0) is generated as a read bus. The read and write signals are active low. Thus, inverters are used to enable the buffers. Without a buffer for isolation, data injected at data_m(15:0) would be voted and sent out to data_p(15:0), which may cause a bus conflict.

Figure 26. TMR Assembly

This design so far provides everything needed for a TMR processor based on the theory described in section B. The next step was to put it on a simulation test bench and run it. The time constraints are 50 ns for clock high and low time and 10 ns for setup and hold time. Since only one clock is used in this simulation, the time constraints are trivial. The simulation results are shown in Figures 27 and 28.

Figure 27. TMR Assembly Simulation 1-1


In Figure 27, the data_m bus offers a series of data regardless of whether the instruction needs it or not. All instruction buses (i.e., instr_a, instr_b and instr_c) have the same instruction at the same time. The first instruction, LW R1 <- Mem(R0+04), is fetched at point 1. It is not executed until point 2. Since the read signal goes low at point 2, it is reasonable to say it loads data 005A₁₆. Signals cid_0, cid_1 and err all report zero because all instructions are consistent. Notice that the data on the data_m bus changes while read_p is still low. A clipping occurs at point 3.

In Figure 28, another instruction, SW R1 -> Mem(R0+02), is fetched. Since R1 had already been loaded at point 2, here we expected to see 005A₁₆ on the data_p bus. Unfortunately this is not the case at point 5. The simulation tells us that KDLX has the read signal active low, but it actually reads data at the rising edge. In this simulation, it read 0061₁₆ at point 3, not 005A₁₆ as desired.

Figure 28. TMR Assembly Simulation 1-2


Since the processor reads at the rising edge, the circuit must be able to keep the data stable to that point. The simulation in Figure 27 shows that 005A₁₆ stays for most of the duration while read_p is low. However, the bus changes to 0061₁₆ at the last instant, which is not a desirable situation. Thus the next step is to modify the circuit to make the data stable through the rising edge of read_p. Figure 29 is the modified design.

2.  Schematic  and  Simulation  2 

A 16-bit latch is added to keep the input data stable. With this latch, the input data only changes when the read signal changes, which should, in theory, provide a perfect timing match. Simulations of this modified TMR Assembly are shown in Figures 30 and 31.
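The effect of the latch on a rising-edge read can be illustrated with a coarse discrete-time model. This is only a sketch: the function name, the list-based waveforms, and the rule that the latch captures on the falling edge of read_p are assumptions made for illustration, not a description of the ISE latch primitive.

```python
def value_read(read_p, data_m, use_latch):
    """Return the value sampled at the first rising edge of read_p.

    read_p and data_m are equal-length lists, one entry per time step.
    """
    held = data_m[0]                           # latch contents
    for t in range(1, len(read_p)):
        falling = read_p[t - 1] == 1 and read_p[t] == 0
        rising = read_p[t - 1] == 0 and read_p[t] == 1
        if falling:
            held = data_m[t]                   # assumed: capture when read_p goes low
        if rising:
            # The processor reads here: latched value, or whatever is on the bus.
            return held if use_latch else data_m[t]
    return None                                # no rising edge in the waveform


if __name__ == "__main__":
    # The bus holds 005A while read_p is low, then changes at the last instant.
    read_p = [1, 0, 0, 0, 1]
    data_m = [0x0000, 0x005A, 0x005A, 0x005A, 0x0061]
    print(f"without latch: {value_read(read_p, data_m, False):04X}")
    print(f"with latch:    {value_read(read_p, data_m, True):04X}")
```

In this model the unlatched read picks up the late bus change, while the latched read returns the value that was present when the read began.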
Figure 29. Modified TMR Assembly

Figure 30. Modified TMR Assembly Simulation 2-1

Points 6 and 7 in Figure 30 are identical to points 1 and 2 of Figure 27. The improvement of the modified TMR Assembly appears at point 8. The latched data is still available at the point where read_p goes high and all three processors now read the value 005A₁₆. The clipping at point 3 in Figure 27 disappears.

Figure 31. Modified TMR Assembly Simulation 2-2

Figure 31 continues the simulation to store the content of R1 to memory location 02₁₆ at point 9. Following the signal write_p to point 10, one can find that the data on data_p is 005A₁₆. Signals cid_1, cid_0, err and v_err show that no error is reported.

D. TMR ASSEMBLY WITH MEMORIES

Since a working TMR Assembly has been generated, the final step is to hook it up with memories. The latch added in Figure 29 guarantees that the processors will read what they need to read. The schematic symbol of the TMR Assembly is shown in Figure 32. The whole circuit is shown in Figure 33.

Figure 32. Schematic Symbol of the Modified TMR Assembly


Many  of  the  signals  in  Figure  33  are  for  the  purpose  of  monitoring  the  simulation. 
As  a  convention,  the  memory  at  the  left  is  the  instruction  memory  and  the  one  at  the  right 
is  the  data  memory.  Two  buffers  are  used  to  control  the  data  flow.  Data  flows  into  the 
data  memory  only  when  the  write  signal  is  low  and  flows  to  the  TMRA  only  when  the 
read  signal  is  low. 

The instruction memory is pre-configured with the following Opcodes: 440301₁₆, 413406₁₆, and 450407₁₆. The first one will load data from memory location 01₁₆ to R3. The second one will add an immediate value 06₁₆ to R3 and save the result to R4. The final instruction will store the content of R4 to memory location 07₁₆. Figure 34 shows the simulation result.
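To make the three Opcodes easier to follow, the sketch below decodes and executes them in a toy interpreter. The field layout (8-bit operation, 4-bit source or base register, 4-bit second register, 8-bit immediate) is inferred from the three descriptions above and is an assumption, not the KDLX specification. The initial value 0003₁₆ at memory location 01₁₆ is also assumed, chosen so that the stored result matches the 0009₁₆ expected in this section.

```python
def decode(word: int):
    """Split a 24-bit instruction word into assumed fields (op, rs, rt, imm)."""
    return (word >> 16) & 0xFF, (word >> 12) & 0xF, (word >> 8) & 0xF, word & 0xFF


def run(program, mem):
    """Execute the three operations used in this test; returns (regs, mem)."""
    regs = [0] * 16                               # R0 is never written here, so it stays 0
    for word in program:
        op, rs, rt, imm = decode(word)
        if op == 0x44:                            # load: rt <- Mem(rs + imm)
            regs[rt] = mem.get(regs[rs] + imm, 0)
        elif op == 0x41:                          # add immediate: rt <- rs + imm
            regs[rt] = (regs[rs] + imm) & 0xFFFF
        elif op == 0x45:                          # store: Mem(rs + imm) <- rt
            mem[regs[rs] + imm] = regs[rt]
        else:
            raise ValueError(f"operation {op:02X} is not modeled")
    return regs, mem


if __name__ == "__main__":
    memory = {0x01: 0x0003}                       # assumed initial data
    regs, memory = run([0x440301, 0x413406, 0x450407], memory)
    print(f"R3={regs[3]:04X} R4={regs[4]:04X} Mem[07]={memory[0x07]:04X}")
```

The model ignores pipelining and timing entirely; it only shows the data flow that the simulation is expected to reproduce.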
Figure 33. Modified TMR Assembly with Memories

Figure 34. Simulation of Modified TMR Assembly with Memories

Unfortunately, although no error was reported, no data was sent out from the data memory. If this design worked correctly, an output value 0009₁₆ should be seen when the TMRA writes to memory. Obviously, this did not happen when addr_rom was 0E₁₆. Since no timing mismatches occurred anywhere, this design was hard to debug. The modified TMR Assembly works fine without memories, so the problem could have been the settings of this test bench. The time constraints of this test bench are listed in Table 18.


                      Processors   Memories
Clock High Time       50 ns        50 ns
Clock Low Time        50 ns        50 ns
Input Setup Time      10 ns        5 ns
Output Valid Delay    10 ns        5 ns
Time Offset           0 ns         0 ns

Table 18. Time Constraints of Test Bench for Modified TMR Assembly

Memories have less setup time and hold time, so they should be ready before the processors need their data. From this point of view, the test bench did not seem to be the problem. Although the problem might have been an incompatibility with the chosen memory, the next approach was to try the original TMR Assembly without the data latch, as shown in Figure 26. Since all input and output signals are the same as in the modified TMR Assembly, the schematic symbol and complete design of the original TMR Assembly are still identical to Figures 32 and 33, respectively. Running the original design with the same test bench and simulation produced the result shown in Figure 35.

This version works. There are almost no timing mismatches and the data clippings are small enough to be ignored. This circuit sends out exactly the right value after the last instruction is executed. When addr_rom is at 0E₁₆, 0009₁₆ is sent out from the TMRA to the data memory in the low half of the clock cycle. The data as seen on out_mem has another half clock delay caused by the memory. Signals cid_1, cid_0, err and v_err verify that no error is reported.

Figure 35. Simulation Result of First TMR Assembly with Memories

The final conclusion is that the latch added in Figure 29 does not help when the TMRA is connected with memories. The simulation results in Figures 30 and 31 worked because the input data was set manually. These manual changes happened regardless of changes in the read or write signals from the processors. Therefore, a latch was needed in this manual test bench.

When the TMRA is connected with memories, the memories will interact with the write signal of the KDLX even though the detailed interaction among them is not visible in the test bench. A latch in the TMRA in this design will ruin the timing between the TMRA and the data memory. Thus, the simulation result in Figure 34 shows that the TMRA is totally unable to communicate with the data memory, while in Figure 35, without the latch, the design works.

E. TEST ON FAULT TOLERANCY OF TMR ASSEMBLY

The concept of the TMR Assembly has been described and explained earlier in this chapter. The usage of the voters has been emphasized as well. Since the TMR Assembly has been designed and simulated, the next requirement is to test the fault-tolerant ability. In order to provide errors, three instruction memories are necessary and more signals need to be monitored.

1.  Schematic  and  Simulation 

Figure 36 is a complete schematic with all of the components for the fault-tolerant testing. The concept is to change one of the instructions loaded into the TMRA and see if the voters can catch the error, correct it, and report it. Since the inconsistent instruction will lead one of the KDLX processors to do something different than the other two, the voters should flag the inconsistency and point out the faulty processor, i.e., either cid_1 or cid_0 or both should not be zero. Some bits in the error detection bus, err, ought to be 1 whenever any error exists. If all these signals work properly, the TMRA will be able to catch an error and trigger an interrupt routine.

Three instruction memories, ROM A, ROM B and ROM C, are pre-configured with three different instruction maps. The data memory at the right side, RAM, has non-repeated values in its memory locations. This makes the data in the simulation more easily identified since each memory address holds a unique value. Memory maps for the ROMs and RAM are displayed in Table 19.

Figure 36. Schematic for Fault-Tolerant Testing

Table 19. Instruction And Data Memory Maps


The inconsistent instructions are grayed out in Table 19. The TMR Assembly simulation is shown in Figures 37, 38, and 39.

Figure 37. Simulation of Fault-Tolerant Testing


Figure 38. Simulation of Fault-Tolerant Testing (continued)

Figure 39. Simulation of Fault-Tolerant Testing (continued)

In Figure 37, when the signal reset_p goes from low to high, the TMRA starts fetching instructions. Notice that the signal out_mem shows 20₁₆, which is the first value at address 00₁₆. The instructions at address 03₁₆ of the ROMs are fetched at point 1. Following that, three more instructions are fetched in sequence. The first instruction, 44010A₁₆, is executed at point 2 in Figure 38 while addr_rom is 05₁₆ and addr_ram is 0A₁₆. The addr_rom contains the address of the instruction being fetched, i.e., 05₁₆. The addr_ram contains the address that the first instruction, i.e., 44010A₁₆, is using to access RAM. In this case, 0A₁₆ is the correct address for this first instruction.

From this point in the simulation, inconsistencies have been introduced in the instruction memory. The bit distribution of the bus needs to be introduced in the next section before the simulation analysis is presented.

2. Bit Distribution

Recall the schematic in Figure 26. Four signals (i.e., V_ERR, CID_0, CID_1, and ERR) are collected into four different buses and each bus is 51 bits wide. Since one 51-bit bus consists of outputs from 6 different voters, each voter has a range in the bus distribution. By looking at the bits in the distribution, one can tell which signal on which processor is wrong. The bit distribution for CID_1, CID_0, and ERR is shown in Figure 40.

Figure 40. Bit Distribution of CID_1, CID_0 and ERR Buses


In Figure 40, the bit distributions of all three buses are identical. For example, a 1 at bit 20 of the ERR bus means that one of the KDLX processors has an error in its program counter. At the same time, bit 20 of the CID_1 and CID_0 buses will point out the faulty processor.
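Mapping a bus bit back to a voter can be sketched as a table lookup. The layout used below (three control bits, then address, program counter, and data) is an assumption inferred from the worked examples in this chapter, such as bit 20 falling in the program counter; it is not copied from Figure 40, and the order of the three control bits is left unspecified. All names are hypothetical.

```python
# (field name, first bus bit, width); assumed layout of the 51-bit status buses
FIELDS = (
    ("control", 0, 3),
    ("address", 3, 16),
    ("program_counter", 19, 16),
    ("data", 35, 16),
)
BUS_WIDTH = sum(width for _, _, width in FIELDS)


def locate(bus_bit: int):
    """Return (field name, bit index inside that field) for one bus bit."""
    for name, start, width in FIELDS:
        if start <= bus_bit < start + width:
            return name, bus_bit - start
    raise ValueError(f"bit {bus_bit} is outside the {BUS_WIDTH}-bit bus")


def flagged(bus_value: int):
    """List every (field, bit) whose flag is set in a bus value."""
    return [locate(bit) for bit in range(BUS_WIDTH) if (bus_value >> bit) & 1]


if __name__ == "__main__":
    print(BUS_WIDTH)          # three 1-bit voters plus three 16-bit voters
    print(locate(20))
    print(flagged(1 << 20))
```

An interrupt routine could apply the same lookup to ERR to find the affected signal, then read the same bit of CID_1 and CID_0 to find the processor.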

3.  Simulation  Analysis 

The three instructions fetched by the TMRA at point 1 in Figure 37 are identical so no error is reported at point 2. Since there is no error in any one of the processors, the cid_1 and cid_0 buses will not identify any processor. It was mentioned that the memory needs a half clock cycle to send out data once it receives signals. That is why the first data is not on the out_mem bus until point 3. It can be verified that the TMRA is loading a correct value.

When the instructions become inconsistent, the error detection signal is no longer zero. Meanwhile, cid_1 and cid_0 locate the faulty processor. This can be checked from point 4 to point 6. Figure 41 is the bit distribution of the error detection signals for the first Opcode, 44010A₁₆. The hexadecimal number in the simulation is translated to a binary number when doing this data analysis.

Figure 41. ERR Analysis for the First Opcode


It is obvious that bit 6 is inconsistent among the three processors. In order to verify the error, the signals cid_1 and cid_0 should be analyzed. Converting the hex numbers in the simulation to binary numbers and comparing the bit distribution with Figure 40 indicates that (Figure 42) the inconsistent bit is on the address bus and Processor A is the faulty processor. Recall from Table 17 that cid_1 is the most significant bit, so 01₂ stands for the first processor (i.e., Processor A). It is true that the instruction at address 01₁₆ in ROM A is the actual location of the error, but since this instruction is only sent to the first processor in the TMRA, Processor A is identified as faulty.

Figure 42. CID_1 and CID_0 Analysis for the First Opcode

The error appears at bit 6 because that is the only location where the output bits are not consistent among the three processors. Figure 43 shows the situation.

                   Hex   Binary
Correct Address    0B    0000 1011
Wrong Address      03    0000 0011

(The two addresses differ only at bit 3.)

Figure 43. Address Comparison for the First Opcode

The second Opcode in ROM B has an incorrect destination register. Since there are no output signals on KDLX for the destination register, point 4 in Figure 38 reports no error, even though this wrong Opcode loads correct data into the wrong register.

The contents of R3 are now inconsistent among the three processors, as are the contents of R10. This kind of error will only be found when the content of the faulty register is used. Point 9 in Figure 39 stores the contents of R3 to memory location 09₁₆. It is known that the data in R3 is wrong in Processor B, but the Opcode difference at point 9 also means that the memory address of Processor C is wrong. Figure 44 shows the simulation result for point 13 in Figure 39. Six inconsistent bits were caught.
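Why the wrong destination register goes unnoticed until R3 is stored can be shown with a toy model of three register files. It is a sketch under simplifying assumptions: only the address is visible to the voters during a load, address and data are visible during a store, and the address error of Processor C is left out so that the register error is isolated. The helper names are hypothetical; the values 2C₁₆, R3, R10, and location 09₁₆ come from the text.

```python
def load(regs, rd, addr, ram):
    """Load ram[addr] into regs[rd]; only the address reaches the voters."""
    regs[rd] = ram[addr]
    return addr


def store(regs, rs, addr):
    """Store regs[rs]; both address and data reach the voters."""
    return addr, regs[rs]


ram = {0x0C: 0x002C}
procs = {name: [0] * 16 for name in ("A", "B", "C")}
dest = {"A": 3, "B": 10, "C": 3}      # Processor B loads into R10 instead of R3

# Load: every processor drives the same address, so the voters see no error.
load_outputs = {p: load(procs[p], dest[p], 0x0C, ram) for p in procs}
load_consistent = len(set(load_outputs.values())) == 1

# Store of R3: Processor B now drives different data, so the error surfaces.
store_outputs = {p: store(procs[p], 3, 0x09) for p in procs}
store_consistent = len(set(store_outputs.values())) == 1

if __name__ == "__main__":
    print("load consistent: ", load_consistent)
    print("store consistent:", store_consistent)
    for p, (addr, data) in store_outputs.items():
        print(f"Processor {p}: address {addr:02X}, data {data:04X}")
```

The voted output is still correct at the store, but the register file of Processor B stays wrong until something rewrites R3.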
Figure 44. ERR Analysis at Point 13

The contents of R3 in Processor B are zero, but in Processors A and C they are 2C₁₆. For cid_1 and cid_0, it is expected that the data portion in the bit distribution indicates that Processor B is wrong. Figure 45 shows the inconsistent bits between the correct and wrong data.

Figure 45. Data comparison for R3


The bit distribution of cid_1 and cid_0 should put 002C₁₆ in the data portion and indicate that all of these inconsistencies are caused by Processor B. Figure 46 illustrates that it does.

Figure 46. CID_1 and CID_0 Data Portion Analysis at Point 13


In addition, the address differences from Processor C at point 9 should also be indicated by cid_1 and cid_0. This is shown in Figure 47.

                                Hex   Binary
Address of A and B (correct)    09    0000 1001
Address of C (wrong)            02    0000 0010

Inconsistent portion of the bus (hex): 58

Figure 47. CID_1 and CID_0 Address Portion Analysis at Point 13

Notice that both cid_1 and cid_0 at point 13 have the hex number 58. The inconsistent bits of the addresses are reflected correctly in the bit distribution. Processor C is identified as the faulty one that gives a different address to the voter than the others. This shows that the cid_1, cid_0 and err signals can deal with these kinds of multiple errors and still report correctly.
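The numbers at point 13 can be reproduced with a short calculation. The sketch assumes that the address field starts at bus bit 3 and the data field at bus bit 35, which is inferred from the hex value 58 and the other examples in this chapter rather than taken from Figure 40. It also assumes that the control and program counter fields carry no error, and that the CID codes are 01 for A, 10 for B, and 11 for C. All names are hypothetical.

```python
ADDRESS_SHIFT = 3    # assumed bus position of address bit 0
DATA_SHIFT = 35      # assumed bus position of data bit 0


def flags16(a: int, b: int, c: int):
    """Return (err, cid_1, cid_0) for one 16-bit voter."""
    y = (a & b) | (b & c) | (a & c)              # bitwise majority
    bad_a, bad_b, bad_c = a ^ y, b ^ y, c ^ y    # bits where each input is outvoted
    return bad_a | bad_b | bad_c, bad_b | bad_c, bad_a | bad_c


def pack(address_field: int, data_field: int) -> int:
    """Place the address and data flags on the 51-bit bus."""
    return (address_field << ADDRESS_SHIFT) | (data_field << DATA_SHIFT)


# Processor C drives address 02 instead of 09; Processor B drives data 0000 instead of 002C.
addr_flags = flags16(0x0009, 0x0009, 0x0002)
data_flags = flags16(0x002C, 0x0000, 0x002C)
err, cid_1, cid_0 = (pack(a, d) for a, d in zip(addr_flags, data_flags))

if __name__ == "__main__":
    print("inconsistent bits:", bin(err).count("1"))
    print(f"cid_1 address portion: {cid_1 & 0x7FFFF:X}")
    print(f"cid_0 address portion: {cid_0 & 0x7FFFF:X}")
    print(f"cid_1 data portion: {cid_1 >> DATA_SHIFT:04X}")
    print(f"cid_0 data portion: {cid_0 >> DATA_SHIFT:04X}")
```

Under these assumptions the model yields six set bits in err, 58 in the address portion of both CID buses (code 11, Processor C), and 002C in the data portion of cid_1 only (code 10, Processor B).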

Following the same procedure to analyze data on the buses, one should be able to understand how the voter works and how to use these signals for an interrupt routine.

The rest of the simulation also performs correctly. The Opcode at address 06₁₆ of ROM C is a disaster since there is no such instruction. Based on the experience just gained, this kind of error will still be corrected. The inconsistency of register contents will be corrected the next time they are used and the wrong addresses will not affect anything as long as the other two addresses are correct. Correct data will still be fetched at point 7 in Figure 38. The memory output data bus switches back to 0020₁₆ at point 8. Next, three store instructions are fetched in series. The first data written to memory shows up at point 10. Simple address inconsistencies at points 11 and 12 are easily analyzed. Errors at point 14 are detected, even though all three Opcodes, 450410₁₆, are the same. That is because the data loaded into R4 earlier was different and the error occurs only when R4 is routed to the output.

F.  IMPORTANT  SIMULATION  CONCEPTS  REVIEW 

Simulation results are used extensively in this chapter to explain the operation of the TMR. Fundamental ideas on how to construct a test bench and how to analyze results have been established. Due to the different properties of the different components, a design may not work when additional components are connected. Generating a good test bench is not easy since most timing problems are unpredictable. Some important simulation knowledge is introduced here in order to help shorten development time.

1.  KDLX  Was  Designed  to  Work  with  Asynchronous  Memory 

In a personal conversation with Dr. Kenny Clark, I learned that the KDLX was designed for an asynchronous memory. Although it will work with a synchronous instruction memory, an asynchronous memory is recommended since one should assume that the instruction memory and the data memory are in the same physical memory. Always provide some different time constraints between KDLX and memories when generating a test bench.

2.  Start  with  A  Simple  Test  Bench  First 

Trying to test everything on a new design is a bad idea. Too many signals need to be tracked and multiple errors are hard to debug. It is a good idea to start with a simple test bench which only tests a small part of the design. Revise the test bench to become more complicated step by step. It is also good to individually test every component generated before constructing a top-level design.

3.  Test  Bench  Is  Optimized  for  the  Current  Design 

As introduced earlier, the simulations have different time constraints. A test bench is used to check whether a design works under reasonable assumptions. Circuits will be modified many times until the full design is complete. It is hard to specify the requirements for a test bench before a circuit is actually built, so it is almost impossible to have an ideal test bench for both a full design and every single component. In addition, a test bench that works on the top-level design may not fit a single component. Timing mismatches always change with different wiring.

4.  Keep  Old  Designs 

It was shown with the TMR Assembly schematic that sometimes an old design is the truly useful one. Incorrect settings for a test bench can mislead a designer into making a wrong decision, and a modified design can become useless when other components are connected. Features of different components will sometimes balance out timing mismatches between them. Going over previous designs helps a designer recall the original thinking, so keeping those files available is important.

5. Working on the Copy of Source

Based on personal experience, it is good to add a copy of a tested circuit into a large design rather than adding the original. This not only keeps the integrity of the original file but also makes it easy to review. Without making a copy, the new design will be associated with the original design. Any modification in the new design directly affects the original file. Therefore, it will be impossible to keep the original source file.

Keeping the integrity of each circuit is also important. People always want to see and test the fundamental design before they jump into the full design. For example, a new designer may want to understand voters before trying to understand the TMR Assembly. Putting all correct and incorrect circuits into one project is convenient for the designer, but this does not help other people to understand. In addition, having all sources in one project reduces their independence when doing individual tests.

There is no question that making a copy of a source file increases the size of the folder and requires more time to manage individual projects. The big benefit is that a designer always has the original designs in hand, and all remaining projects are tested and ready to go. A new designer thus has a chance to see the function of a voter before sinking into the confusion of the complete TMR Assembly. Since another new project will be generated once a project has failed, a design like the TMR Assembly may have different versions. The useful version contains only useful schematics and test benches. From this point of view, all remaining projects are not only useful but also have few or no junk sources inside.

Since hard drive space nowadays is large and cheap, working on a copy of a file not
only gives people a chance to review but also makes all projects look clean and easy to
understand.

G.  CHAPTER  SUMMARY 

This chapter introduced the kernel of the full TMR design, i.e., the TMR Assembly.
Understanding how voters catch errors and how to analyze simulation results is the
main point of this chapter. Many explanations of simulation results are provided in order
to help one grasp the spirit of the TMR design. After reading so many simulations, one
should have a feel for how to generate and use a test bench. A quick review of simulation
concepts is placed at the end of this chapter, after one has studied some simulations and
before one moves on to a more complex design.

Other components associated with the TMR Assembly, namely the Reconciler, the
Interrupt and the Error Syndrome Storage Device (ESSD), will be explained in the
following chapters. The Reconciler is an interface between KDLX and memory; the
Interrupt starts the ISR; the ESSD is responsible for storing error syndromes whenever
an error occurs.
VI.  RECONCILER


Due to the different memory architectures of KDLX and CFTP, as described
in Chapter IV, the Reconciler is used to satisfy the timing requirements on both sides and
properly route the data. Since KDLX can only access memory via load and store
instructions, the Reconciler only needs to monitor the read and write signals from KDLX
and direct the data to the correct destinations.


In  this  chapter,  no  error  detection  or  correction  will  be  discussed  since  the  Recon¬ 
ciler  is  not  responsible  for  this.  The  TMR  Assembly  is  responsible  for  error  detection. 
Error  correction  is  done  by  the  Interrupt  and  the  voters  in  the  TMR  Assembly.  Storing 
the  error  syndromes  is  the  job  of  the  Error  Syndrome  Storage  Device  (ESSD). 

A.  CONSTRUCTION  AND  FUNCTION 


Only one physical memory is available in the CFTP. In order to make this one
memory act as both instruction memory and data memory in each KDLX clock cycle,
the physical memory has to run at twice the speed of KDLX. For the same reason, the
Reconciler also has to run twice as fast as KDLX. For each KDLX clock cycle, one
address bus access and one data bus access for instructions need to be available.
Meanwhile, one address bus access and one data bus access for data also need to be
available. To fetch an instruction and do a data read or write, the Reconciler has to act
as an instruction memory in the first half of the KDLX clock cycle and as a data memory
in the second half of the KDLX clock cycle. This function is illustrated in Figure 48.


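[Figure 48: timing diagram with three rows (KDLX clock, KDLX signals, and memory or
Reconciler clock). Each KDLX clock cycle is divided into an instruction fetch, labeled
"pc(15:0) is available" and "instr(23:0) is available", followed by a data read or write,
labeled "addr_int(15:0) is available" and "data(15:0) is available".]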

Figure  48.  Illustration  of  Reconciler  Function 
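
The time multiplexing illustrated in Figure 48 can be modeled in a few lines of Python. This is a behavioral sketch only, not the VHDL of Appendix C; the function names and the sample memory contents are assumptions made for illustration.

```python
def select_address(half, pc, addr_int):
    """Return the address driven onto the single memory in one half-cycle.

    half 0 = first half of the KDLX clock (instruction fetch),
    half 1 = second half (data read or write).
    """
    return pc if half == 0 else addr_int


def kdlx_cycle(memory, pc, addr_int):
    """Model one KDLX clock cycle as two accesses to the same memory."""
    instr = memory[select_address(0, pc, addr_int)]  # memory acts as instruction memory
    data = memory[select_address(1, pc, addr_int)]   # memory acts as data memory
    return instr, data


# Hypothetical contents: one 24-bit instruction word and one 16-bit data word.
memory = {0x0000: 0x440140, 0x0040: 0x1234}
instr, data = kdlx_cycle(memory, pc=0x0000, addr_int=0x0040)
print(f"{instr:06X} {data:04X}")
```

Because both accesses go to the same memory within one KDLX cycle, the memory (and whatever sits in front of it) must complete two accesses per KDLX clock, which is why it runs at twice the KDLX speed.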
The  Reconciler  is  composed  of  a  state  machine  coded  in  VHDL  and  is  presented
completely  in  Appendix  C,  section  A.  The  state  machine  contains  five  states:  one  starting 
point,  two  for  normal  operations,  one  for  read,  and  one  for  write.  This  function  can  be 
seen  clearly  in  Figure  49. 


[Figure 49: state machine diagram of the Reconciler.]


The name of each state is at the top of its circle, except for the initial state, which is
named State. The number in each state is the state number, designed for tracking purposes
in the simulation. The two normal-operation states, State0 and State1, are identical and
are for fetching instructions. Without reading or writing, these two states just pass the
program counter to memory, fetch the instruction and send it back to the KDLX. At this
time, the memory acts as a ROM and its data-input bus is in a high impedance state. Since
only the instruction bus is used, the data bus of the KDLX is also in a high impedance state.
State1 is a duplicate of State0, so the state machine could be revised to stay at State0
when neither rd_r nor wr_r is 0. The reason for using two states is to provide tracking in
simulation. Since the Reconciler runs twice as fast as the KDLX, reading and writing
actions only occur at State0. Without the separation into two states, it is hard to tell
whether a read or write occurs at the proper state.
When rd_r is 0 and wr_r is 1, the state machine goes to the ReadState. KDLX
wants to read data from memory, so the Reconciler passes a high write signal to the
memory and directs data from the memory to KDLX. When rd_r is 1 and wr_r is 0, the
Reconciler knows that KDLX wants to write data to the memory, so it passes a low write
signal to memory and directs data from KDLX to memory.

The initial state, State, is not used until the next reset. It is null and there are no
actions in this state. Without this state, the state machine would use State0 as the initial
state and start at State1 after reset.
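
The five states described above can be sketched as a small behavioral model. This is an assumption-laden illustration rather than the VHDL of Appendix C: the transitions out of State1, ReadState and WriteState (all returning to State0) are a plausible reading of the text, and the helper names are hypothetical.

```python
from enum import Enum


class State(Enum):
    INIT = "State"        # initial state, entered only on reset
    STATE0 = "State0"     # instruction fetch
    STATE1 = "State1"     # duplicate of State0, kept for tracking
    READ = "ReadState"
    WRITE = "WriteState"


def next_state(state, rd_r, wr_r):
    """Assumed transition function; rd_r and wr_r are active low."""
    if state is State.INIT:
        return State.STATE0
    if state is State.STATE0:
        if rd_r == 0 and wr_r == 1:
            return State.READ
        if rd_r == 1 and wr_r == 0:
            return State.WRITE
        return State.STATE1
    # Assumption: State1, ReadState and WriteState all return to State0.
    return State.STATE0


def address_out(state, pc_r, addrin_r):
    """Address sent to memory: the data address while reading or writing,
    the program counter while fetching, nothing in the initial state."""
    if state is State.INIT:
        return None
    if state in (State.READ, State.WRITE):
        return addrin_r
    return pc_r


def write_signal(state):
    """Write line passed to memory: low only while writing."""
    return 0 if state is State.WRITE else 1
```

Stepping `next_state` from `State.INIT` with `rd_r = 0, wr_r = 1` held constant gives State0, ReadState, State0, ReadState, and so on, which matches the idea that a data access takes the half-cycle after an instruction fetch.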

B.  SCHEMATIC  AND  SIMULATION  OF  RECONCILER  ONLY

Converting VHDL code to a schematic symbol is a useful function of the ISE
software. The schematic symbol of the Reconciler is shown in Figure 50.
[Figure 50: schematic symbol with ports including wrout_r, wr_r, addrin_r(15:0),
pc_r(15:0), instr_data(23:0) and mem_data(15:0).]

Figure  50.  Schematic  Symbol  of  Reconciler 


Simulation of the Reconciler itself is quite simple. Since it is basically a state
machine, it either stays in the current state or jumps to a new state every clock cycle.
Figure 51 is the simulation result.
Figure  51.  Simulation  Result  of  the  Reconciler
The signal at the bottom of Figure 51 is the state number used to track which state
is active. The state machine starts at State0 after reset. The signal addrout_r is the bus
connected to the memory address bus. It sends out either pc_r or addrin_r, depending
on whether the system is doing an instruction fetch or a data read/write. In State0 and
State1, addrout_r always has the same value as pc_r. The memory data output bus
connects to the signal datain_r on the Reconciler and sends out either an instruction or a
data value. When rd_r is low, data on datain_r is forwarded to mem_data, which
connects to the data bus of KDLX. When wr_r is low, the state machine goes to the
WriteState. In this state, data from KDLX is available on mem_data and the Reconciler
directs this data to dataout_r, which connects to the data input bus of memory.

The instr_data bus is never in a high impedance state, regardless of whether the data
on datain_r is an instruction or not. The reason is to keep an instruction available
until the next KDLX clock cycle. Even during ReadState and WriteState, the next
instruction for the KDLX is held on the instruction bus. Remember that the Reconciler is
twice as fast as the KDLX. If the next instruction were only available for the first half of
the KDLX clock cycle, it would not be fetched at the rising edge of the next KDLX clock.
This concept will be described again when the Reconciler is hooked up to a KDLX
processor.

C.  SCHEMATIC  AND  SIMULATION  OF  RECONCILER  WITH  KDLX 

The last step in testing the Reconciler is to simulate it with a KDLX. The schematic
of this part of the design is shown in Figure 52.
Figure  52.  Schematic  of  Reconciler  with  KDLX  and  Memory

The memory offered in the ISE software is not a true Von Neumann memory: it has
separate data input and output buses rather than one bi-directional data bus. The
Reconciler is therefore designed with two separate data buses, datain_r(23:0) and
dataout_r(23:0). The mem_data(15:0) bus on the Reconciler is bi-directional in order to
transfer data back and forth with the KDLX.
The  simulation  for  this  circuit  is  done  with  a  series  of  load  and  store  instructions
in  order  to  see  if  the  Reconciler  can  handle  both  instructions  and  data  correctly.  Figure 
53  is  the  first  part  of  the  simulation  result. 


Figure  53.  The  First  Part  of  the  Simulation  Result  for  Reconciler 

In Figure 53, the first instruction in memory is fetched at point 1, when pc_p is
sent. It can be seen clearly from the status of state_r that the Reconciler runs at double
speed. At point 2, the Opcode 440140₁₆ is executed and wants to load data into R1. At
the same time, the KDLX is going to fetch the Opcode 440443₁₆. The address of the data
for the first instruction is available at point 3 in this time interval. Therefore, the signal
addr_in takes pc_p in the first half of the KDLX clock cycle and addr_p in the
second half of the KDLX clock cycle. The data at memory location 0040₁₆ is thus sent
from memory to KDLX when state_r is 2. Notice that in this time interval Opcode
440443₁₆ stays available on the bus until the next KDLX clock. This is important since
KDLX is triggered at the rising edge of the clock. Failure to hold an instruction until the
next rising edge would mean that the KDLX could not fetch this instruction and the
memory location for its data would not appear at point 4. This is why the instruction bus
is not set to a high impedance state in the ReadState and WriteState of the Reconciler.
The rest of this simulation, in Appendix A, section H, does a series of writes followed by
a series of reads in order to check that the Reconciler functions properly.

D.  TIMING  CONCERNS

An added complexity for this simulation is the fact that it has three different
clocks. To make this simulation work, the time constraints of the test bench have to be
set properly. The sequence of execution in this circuit is as follows. The KDLX first
sends its program counter to the Reconciler. Then the Reconciler forwards this address
to the memory. Next, the memory selects the instruction and sends it to the Reconciler.
Finally, the Reconciler forwards this instruction to the KDLX. This is a simple example
of how KDLX fetches an instruction.


In  order  to  successfully  fetch  an  instruction,  the  KDLX  has  to  have  its  program 
counter  ready  before  the  Reconciler  needs  it.  The  Reconciler  has  to  have  the  address  set 
before  the  memory  is  ready  to  receive  it.  Considering  setup  time  and  hold  time  for  each 
clock,  the  relationship  among  these  three  clocks  is  shown  in  Figure  54. 
Figure  54.  Timing  Relationship  Among  Clocks


It does not matter that the Reconciler and memory clocks are faster than the KDLX
clock, since the KDLX has to be ready whenever the Reconciler needs data. In Figure 54,
all three clocks are shown together, as they were in the simulation, for comparing timing
requirements. Since the Reconciler has a longer hold time than KDLX, the KDLX will be
ready before the Reconciler is ready. The Reconciler will be set before the memory
needs input signals.

When KDLX is executing a read-data instruction, the memory makes the data
available slightly after the KDLX starts to read. Therefore, a little clipping occurs every
time that KDLX reads data. To minimize this clipping, the setup and hold times of
the three clocks have to be as close to each other as possible.

In  this  simulation,  if  any  two  clocks  have  identical  setup  and  hold  time,  the  testing 
will  fail.  Since  the  Reconciler  is  a  state  machine,  the  current  state  will  jump  to  a  different 
state  if  the  conditional  requirements  are  not  met  in  time.  This  causes  the  KDLX  to  fail  to 
interact  with  the  memory;  therefore  the  following  instructions  will  not  be  fetched. 
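
One way to read the two requirements above (no two clocks may share the same timing, yet the timings should be as close as possible) is as a strict ordering with a small spread. The sketch below lumps each clock's setup and hold time into a single "offset" number, which is a simplifying assumption, and the nanosecond values are hypothetical rather than taken from the thesis test bench.

```python
def edges_are_staggered(kdlx_offset, reconciler_offset, memory_offset):
    """True when the KDLX is ready first, then the Reconciler, then the memory.

    Equal offsets are rejected, mirroring the observation that identical
    setup and hold times for any two clocks make the test fail.
    """
    return kdlx_offset < reconciler_offset < memory_offset


def clipping(kdlx_offset, memory_offset):
    """Spread between the first and last clock; smaller means less clipping."""
    return memory_offset - kdlx_offset


# Hypothetical offsets in nanoseconds.
print(edges_are_staggered(1, 2, 3))   # staggered and close together
print(edges_are_staggered(2, 2, 3))   # two clocks identical: rejected
print(clipping(1, 3))
```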

E.  CHAPTER  SUMMARY

This  chapter  introduced  the  function  of  the  Reconciler  in  the  TMR  design.  This 
component  is  designed  to  consolidate  two  different  architectures  in  a  circuit  and  is  not  di¬ 
rectly  associated  with  error  detection  or  correction  in  the  TMR.  This  is  the  first  time  in 
this  thesis  that  time  constraints  were  discussed  in  detail  since  there  are  specific  timing  re¬ 
quirements  for  the  Reconciler.  The  concept  of  establishing  the  setup  time  and  hold  time 
for  a  test  bench  is  more  important  after  this  chapter  because  more  components  are  in¬ 
volved  in  the  TMR  design. 

Another  component  (called  Interrupt)  is  discussed  in  the  next  chapter.  This  com¬ 
ponent  leads  the  TMR  design  to  the  Interrupt  Service  Routine  (ISR)  when  an  error  oc¬ 
curs.  How  to  intercept  the  current  execution  of  the  KDLX  to  start  an  ISR  and  how  it 
works  with  other  components  in  the  TMR  design  will  be  described  as  well. 
VII.  INTERRUPT


The TMR Assembly, consisting of processors and voters, is able to detect an error
and correct it. Even though voters are able to correct errors as they come out of the
system, whichever KDLX processor caused the error will still hold the wrong data
inside. If an error in one processor is not corrected in time, another error occurring in
another processor may not be detected by the voters. As was described earlier in
Chapter V, a majority voter is not able to handle multiple identical errors.

In  order  to  correct  an  error  in  the  KDLX,  the  normal  operation  has  to  be  stopped 
and  all  contents  of  registers  in  the  three  processors  have  to  be  voted.  The  voters  will  cor¬ 
rect  any  inconsistency  between  the  three  processors  in  this  process  while  storing  all  cor¬ 
rect  data  into  memory  and  then  reloading  them  back  into  the  original  registers.  Once  this 
procedure  is  done,  all  contents  of  registers  are  identical  between  the  three  processors. 

The  Interrupt  is  the  circuit  used  to  stop  normal  operation  and  switch  the  circuit  to  do  this 
error  correction. 

A.  CONSTRUCTION  AND  FUNCTION

The Interrupt is also a state machine coded in VHDL. The state machine is
shown in Figure 55. The concept is to have it look for the error detection signal from the
TMR Assembly. If an error occurs, it latches the current program counter and sends out
a TRAP instruction to the processors. Two NOPs follow the TRAP instruction in order to
clear the pipeline of the processors. Only two NOPs are needed because the TRAP
instruction starts to be executed right after the second NOP. Any instruction after the
second NOP would either be useless or mask out instructions that the TRAP wants to fetch.
After the second NOP, the TMR Assembly is in the ISR and the Interrupt waits for an
RFE instruction from memory, placed to mark the end of the ISR.

When the processors receive the TRAP instruction sent from the Interrupt, they jump
to a specific memory location and start the ISR for storing and reloading the contents of
all of the registers. The last instruction in the ISR is the RFE instruction. When memory
sends out this instruction, it is seen by the Interrupt, and the Interrupt replaces
the RFE instruction with a new Jump instruction. This new Jump instruction is
constructed by the Interrupt from the Opcode C8₁₆ plus the latched program counter to
force the processors to jump back to where the trap occurred.
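
The construction of the new Jump instruction can be illustrated with a short Python sketch. It assumes that "plus" means concatenating the 8-bit opcode with the 16-bit latched program counter to form a 24-bit instruction word, which is consistent with the 24-bit instruction bus and 16-bit program counter used in this design; `build_jump` is a hypothetical helper name.

```python
JUMP_OPCODE = 0xC8  # opcode byte quoted in the text


def build_jump(latched_pc):
    """Concatenate the 8-bit Jump opcode with a 16-bit program counter."""
    if not 0 <= latched_pc <= 0xFFFF:
        raise ValueError("program counter must fit in 16 bits")
    return (JUMP_OPCODE << 16) | latched_pc


# A latched program counter of 0005 (hex) yields the instruction word C80005.
print(f"{build_jump(0x0005):06X}")  # prints C80005
```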


[Figure 55: state machine diagram of the Interrupt; transition labels include "err = 1"
and "no RFE".]


Figure  55.  State  Machine  of  Interrupt 
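
The behavior of the single-speed state machine can be sketched as follows. This is a behavioral model under stated assumptions, not the VHDL of Appendix C: the state names are inferred from the labels of the revised state machine, the NOP encoding 000000 (hex) is assumed, and the TRAP and RFE opcodes are the ones quoted elsewhere in this chapter.

```python
TRAP = 0x280030  # TRAP opcode used in this thesis
NOP = 0x000000   # assumed NOP encoding
RFE = 0xF80000   # RFE opcode that marks the end of the ISR
JUMP_OPCODE = 0xC8

# Unconditional transitions; State0 and WaitState are handled in step().
NEXT = {
    "TrapState": "NopState0",
    "NopState0": "NopState1",
    "NopState1": "WaitState",
    "BackState": "State0",
}


class Interrupt:
    def __init__(self):
        self.state = "State0"
        self.latched_pc = 0

    def step(self, err, pc_in, mem_instr):
        """Advance one clock and return the instruction the processors see."""
        if self.state == "State0":
            if err:
                self.latched_pc = pc_in  # remember where to return
                self.state = "TrapState"
        elif self.state == "WaitState":
            if mem_instr == RFE:
                self.state = "BackState"
        else:
            self.state = NEXT[self.state]

        if self.state == "TrapState":
            return TRAP
        if self.state in ("NopState0", "NopState1"):
            return NOP
        if self.state == "BackState":
            return (JUMP_OPCODE << 16) | self.latched_pc
        return mem_instr  # State0 and WaitState pass memory through
```

Feeding this model an error followed later by an RFE word produces the sequence TRAP, NOP, NOP, the ISR instructions unchanged, and finally the Jump word in place of the RFE.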

Recall the function of the TRAP and RFE instructions in Table 13. The reason for
replacing the RFE instruction with a Jump instruction is that the RFE instruction does not
jump back to where the TRAP instruction occurs. It is known that the RFE jumps to
the address stored in the IAR, which is two clock cycles later than when the TRAP
occurred. The choice was between revising a tested version of KDLX and building a
separate circuit able to generate a new Jump instruction. The separate circuit is easier to
achieve for this Interrupt since it is a state machine coded in VHDL. First, a state
machine can do several different things in one clock cycle. Because the new Jump
instruction is not needed until the BackState, two NOP clock cycles are sufficient for
generating an instruction. Second, data on different buses can be combined more easily
in VHDL than with other methods, e.g., schematics.

The Reconciler discussed in the previous chapter only allows an instruction to be
fetched in the first half of the KDLX clock cycle, but the state machine shown in Figure
55 runs at the same speed as the KDLX. In order to interrupt and insert instructions with
the correct timing, the Interrupt has to match the speed of the Reconciler. Doubling the
speed of the Interrupt cannot be done in the same way as for the Reconciler, since the
Interrupt has several different states in series. The methodology here is to duplicate each
state, which makes the state machine twice as long. The new state machine is shown in
Figure 56 and its VHDL code is in Appendix C, section B.


[Figure 56: state machine diagram of the revised Interrupt; labels include
"reset_i = 0", NopState0_A, NopState0_B and NopState1_A.]


Figure  56.  New  State  Machine  of  Interrupt 


The first two states, State0_A and State0_B, do not need to be duplicated in spite
of the even number of states. The state machine is also revised so that only State0_B can
go to TrapState_A. In spite of the double speed, State0_A still needs to go to State0_B
even if an error occurs at State0_A. On the other hand, the KDLX reads and writes data at
the falling edge of the clock, which means that a data error always occurs at State0_B.
After NopState1_B, the TMR design starts the ISR and the WaitState_B waits for the RFE
instruction. Once the RFE instruction is sent out from memory, the Interrupt takes over
the instruction bus again and injects the new Jump instruction at the BackState_A. The
TMR design goes back to normal operation when the new Jump instruction is executed by
the processors.

B.  SCHEMATIC

The functions of the Interrupt can be easily understood from the simulation result
shown in Appendix A, section I. The simulation of the Interrupt alone is not explained
here since state_i explicitly indicates the active states of Figure 56. Figure 57 is the
schematic symbol of the Interrupt.


[Figure 57: schematic symbol of the interrupt block with ports clk_i, sel_i(23:0),
reset_i, pc_out(15:0), err, trap_i(23:0), rfe_i(23:0), pc_in(15:0) and state_i(3:0).]

Figure  57.  Schematic  Symbol  of  Interrupt 


The input signal err is used to monitor the occurrence of an error. When this signal
goes high, the ISR starts. Once the ISR is triggered, the program counter where the
error occurs is sent to pc_in(15:0), where it is latched, and this latched program
counter is output instantly at pc_out(15:0). The Interrupt uses signal sel_i(23:0) to
switch a mux and sends out the TRAP instruction via trap_i(23:0). After that, sel_i(23:0)
switches the mux back to normal and the input signal rfe_i(23:0) starts monitoring the
Opcodes passing through on the instruction bus. When the RFE instruction is sent out
from memory, sel_i(23:0) activates again and trap_i(23:0) sends out the new Jump
instruction. Consequently, the TMR design is back to its normal operation. Figure 58 is
the design of the Interrupt with a processor and two memories.
Figure  58.  Schematic  of  the  Interrupt  with  KDLX  and  Memories
The mux located between the instruction memory and KDLX is used by the Interrupt
to inject the TRAP instruction. Normally, the KDLX fetches instructions from the
instruction memory and the mux allows this traffic to pass. When an error occurs, the mux
controlled by the Interrupt immediately switches to the other bus and a TRAP instruction
generated by the Interrupt is sent to the KDLX. The original instruction at this time is
blocked on the bus and the KDLX receives the TRAP instruction instead. The Opcode
for the TRAP instruction in this thesis is 280030₁₆, which uses memory location 0030₁₆ as
the starting point of the ISR. This value can be easily changed in the Interrupt's VHDL
code. The basic idea is to keep the ISR address away from the addresses used for normal
operations in memory so that the ISR is not overwritten. Simulations in this thesis are
carefully designed, and small address spaces let people see the complete implementation
in memories.
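
The role of the mux and the layout of the TRAP word can be sketched in Python. The boolean `select_interrupt` is a hypothetical abstraction of the select signal (its real width and polarity are defined by the VHDL, not by this sketch), and the split into a top byte and a low 16-bit field is inferred from the statement that the TRAP word points at the start of the ISR.

```python
def instruction_mux(select_interrupt, mem_instr, interrupt_instr):
    """2-to-1 mux on the instruction bus between memory and the Interrupt."""
    return interrupt_instr if select_interrupt else mem_instr


def split_instruction(word):
    """Split a 24-bit word into its top byte and its low 16 bits."""
    return (word >> 16) & 0xFF, word & 0xFFFF


TRAP = 0x280030

# Normal operation: the memory instruction passes through the mux.
print(f"{instruction_mux(False, 0x440606, TRAP):06X}")  # prints 440606
# On an error: the memory instruction is blocked and TRAP is delivered.
print(f"{instruction_mux(True, 0x440606, TRAP):06X}")   # prints 280030

opcode, target = split_instruction(TRAP)
print(f"{opcode:02X} {target:04X}")  # prints 28 0030
```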

C.  SIMULATION 

Table  20  shows  the  contents  of  the  memories  and  the  registers  before  and  after  the 
simulation. 


Registers

Register  Contents
00
01        0044
02        0045
03        0046
04        0047
05        0048
06        0049
07        004A
08        004B
09        004C
10        0055
11        0066
12        0077
13
14
15

Data Mem

Address  Contents
00
01       0044
02       0045
03       0046
04       0047
05       0048
06       0049
07       004A
08       004B
09       004C
0A
0B
0C
0D
0E
0F
10       0044
11       0045
12       0046
13       0047
14       0048
15       0049
16       004A
17       004B
18       004C
19

Instruction Mem

Address  Contents    Address  Contents
00                   2D
01                   2E
02       440101      2F
03       440202      30       000000
04       440303      31       000000
05       440404      32       000000
06       440505      33       450420
07       440606      34       450520
08       440707      35       450620
09       440808      36       450720
0A       440909      37       411A11
0B       450110      38       411B22
0C       450211      39       411C33
0D       450312      3A       000000
0E       450413      3B       000000
0F       450514      3C       000000
10       450615      3D       F80000
11       450716      3E       000000
12       450817      3F       000000
13       450918      40       000000
14       450A19      41
15       450B1A      42
16       450C1B      43
                     44
                     45
2C                   46

Table  20.  Tables  of  Registers  and  Memories  in  Simulation 
Part of the complete simulation is shown in Figures 59 and 60. An error is seen at
point 1 and the instruction at point 2 is intercepted by the Interrupt. It can be seen clearly
that the value of signal sel_i changes and a TRAP instruction followed by two NOPs is
injected at point 3.


One important thing here is that the time an error is seen is not the time the error
occurs. This is because the KDLX is pipelined and the memory stage is the fourth
pipeline stage. Including the time for the Interrupt to respond, the total delay from the
instruction causing the error is four KDLX clock cycles. This feature cannot be seen in
this simulation because the error was set manually.

The program counter latched by the Interrupt at point 3 is 0008₁₆ in this simulation.
The instruction intercepted is 440606₁₆, which is at address 07₁₆ in Table 20. The
concept is to jump back to where the TRAP was inserted. Theoretically, the program
counter latched should be 0007₁₆, not 0008₁₆. Because of the change of pc_p at point
3 and the instruction delay from memory, the latched program counter has the wrong
value. Another possible reason is that, since this error is generated by the test bench and
not by the circuit itself, the timing of the occurrence of the error could be in the wrong
place. This issue will be discussed again and resolved in Chapter VIII, when the full
design without ESSD is presented.

The TRAP instruction inserted at point 3 affects the circuit at point 5. Opcodes
from instruction memory address 30₁₆ to 40₁₆ are the ISR. Instructions in the ISR can be
related or unrelated to the original commands, but the purpose is to correct the error.
Since there is no actual error in this simulation, the ISR is designed just to do something
else. The full function of the real ISR is to store all contents of the registers to memory
and reload these contents back into the registers. The ISR in this simulation is incomplete.


Figure  60.  Partial  Simulation  Result  of  Interrupt  with  KDLX  (continued)


Storing the contents of R4 to R7, the simulation shows that R6 and R7 at point 6 are
not loaded with any value. This proves that the Interrupt can successfully insert the
TRAP instruction. At point 7, the RFE instruction (i.e., F80000₁₆) is detected by the
Interrupt. Instantly, sel_i switches to zero and trap_i sends out the new Jump instruction,
C80005₁₆. As described earlier, the new Jump instruction is formed from (C8₁₆ + latched
program counter). Therefore, the Opcode C80005₁₆ is generated and executed at point 8.
The rest of the simulation, in Appendix A, section J, checks the contents of the registers
to verify the operation.

D.  CHAPTER  SUMMARY

The functions of the Interrupt were described and simulated in this chapter.
When an error occurs, the Interrupt should lead the TMR design to do error correction
and also be able to bring the circuit back to its normal operation. The purpose is to
correct an error as soon as possible after it occurs, so that the error does not propagate
and cause the circuit to lose control.

The  first  design  of  the  Interrupt  was  to  replace  instructions  in  memory  in  order  to 
implement  the  ISR.  This  could  not  be  done  in  this  design  because  a  ROM  is  used  as  the 
instruction  memory.  Since  the  real  CFTP  design  uses  only  one  RAM,  the  instruction  set 
could  be  changed  in  memory.  However,  changing  original  instructions  is  the  last  thing 
people  want  to  do  because  it  may  cause  an  unrecoverable  error. 

In  the  next  chapter,  the  full  design  without  ESSD  will  be  introduced.  The  usage 
of  the  ISR  will  be  described  clearly  and  the  interactions  between  Interrupt  and  Reconciler 
will  be  expressed  as  well.  The  simulation  of  the  full  design  should  clarify  any  confu¬ 
sions  among  the  different  components. 
VIII.  THE  FULL  DESIGN  WITHOUT  ESSD


The full design in this chapter consolidates the TMRA from Chapter V, the
Reconciler from Chapter VI and the Interrupt from Chapter VII. The TMRA contains three
KDLX processors and six voters. All outputs of the processors are voted and any error
will be corrected. The Reconciler is responsible for integrating the Harvard and Von
Neumann architectures. It runs at double speed in order to act as an instruction memory
in the first half of the KDLX clock and as a data memory in the second half of the KDLX
clock. The component used to correct errors besides the voters is the Interrupt. It
intercepts normal operation of the TMRA when an error occurs, forces it to run an ISR and
makes it jump back to normal operation after the error is corrected. The error signal for
the Interrupt is given by the TMRA. For this design the voter is assumed to be error-free
and the voter error detection signal is not used.

Each  component  discussed  earlier  has  been  simulated  to  prove  its  function  with  or 
without  the  KDLX  and  memories.  Simulating  all  these  components  together  in  a  circuit 
should  be  able  to  catch  and  correct  an  error.  This  is  the  goal  for  the  full  design  and  its 
function  will  be  proved  in  this  chapter. 

A.  SCHEMATIC 

The TMRA itself basically connects with the memories as just one KDLX would.
Most input and output buses are the same, except that the number of signals increases or
decreases. The Reconciler sitting between the TMRA and the memory has to receive all
output signals that the original KDLX has except the program read signal, i.e., the read
and write signals, the program counter, the address for data, and the data bus. The
Interrupt needs the error signal to trigger the ISR, the program counter to generate a new
Jump instruction, and instructions for doing TRAP, RFE and Jump.

In order to test the circuit, several buses and the memory have to be triplicated. The
way to test the error handling of the system is to program an inconsistency into one of the
three memories and expect that the circuit can catch the error and correct it. Without this
artifice, the Interrupt will never work and the ISR will never be triggered. The alternate

Figure 61. The Full Design


In Figure 61, only the Interrupt is unchanged since it does not have any data bus
connections. Three RAMs are used, and a bus connects each to one of the processors.
Therefore, both the Reconciler and the TMRA have more buses than before. The three muxes at
the bottom left are used to intercept the TRAP and Jump instructions. The box at the top
left (called or51to1) is coded in VHDL and ORs the 51 bits of ERR(50:0) into 1 bit. Any
error that occurs at any output signal of the KDLX will trigger the ISR. The revised
VHDL code for the Reconciler is in Appendix C, section C.
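
The OR reduction performed by or51to1 can be modeled in a few lines. The sketch below is
in Python rather than the VHDL of the thesis, and the integer representation of the
ERR(50:0) bus is an assumption made for illustration.

```python
ERR_WIDTH = 51  # width of ERR(50:0) as given in the text


def or51to1(err: int) -> int:
    """Return 1 if any of the 51 error bits is set, else 0.

    The bus is modeled as a Python int (an assumption); bits above
    position 50 are ignored.
    """
    mask = (1 << ERR_WIDTH) - 1
    return 1 if (err & mask) != 0 else 0
```

Any single set bit, on any of the voted signals, is therefore enough to raise the error
signal that triggers the ISR.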

Because  the  Interrupt  must  monitor  a  memory  bus  in  order  to  detect  the  RFE  for 
testing,  one  of  the  memories  must  always  be  correct.  This  design  chooses  RAM  A  as  the 
monitored  RAM;  therefore  its  contents  are  always  correct. 

B. SIMULATION

The three RAMs are pre-configured as shown in Figure 62. In order to express the
concept of the TMR and keep the simulation simple, only the data at memory location
4C₁₆ is different for RAM B. The ISR is designed to start at address 30₁₆ and end at 3C₁₆.
What the ISR does is store the contents of registers to memory, relying on the voters to
ensure that the correct contents are written into memory. (In the real circuit, the ISR then
restores all registers from these correct values in memory.) The opcode F80000₁₆ is the
RFE instruction used to tell the Interrupt where the end of the ISR is. Instructions from
address 0A₁₆ to 10₁₆ are used to check data in registers.

RAM A, B and C

Address  Contents      Address  Contents
00       000000        2D
01       000000        2E
02       44014A        2F
03       44024B        30       45014A
04       44034C        31       45024B
05       44044D        32       45034C
06       44054E        33       45044D
07       44064E        34       45054E
08       000000        35       45064F
09       000000        36       000000
0A       000000        37       000000
0B       44014A        38       000000
0C       44024B        39       F80000
0D       44034C        3A       000000
0E       44044D        3B       000000
0F       44054E        3C       000000
10       44064E        3D
11       000000        3E
12       000000
13       000000
14       000000
                       4A       0000AA
                       4B       0000BB
                       4C       0000CC   (RAM B has 000011)
                       4D       0000DD
                       4E       0000EE
                       4F       0000FF
2C                     50

Figure 62. Memory Pre-configurations


Figures 63, 65, and 66 display the full simulation result; some trivial signals
are not shown. There are four clocks in this design. Clock signals clk_p, clk_i, clk_r, and
clk_m are for the KDLXs, Interrupt, Reconciler, and RAMs, respectively. The KDLX
clock runs at one-half the speed of the others. Since the Interrupt does not need signals
from the Reconciler and vice versa, these two components run at the same clock
speed. The RAMs wait for the outputs of the Reconciler, so the memory clock has
the longest setup and hold time.

The KDLXs, Interrupt and Reconciler are reset at point 1; only rest_p for the
processors is shown. When the program counter, _p, is 0002₁₆, the first instruction is
fetched. It is known that the instruction at point 2 should cause an error because the data
at address 4C₁₆ is not consistent between the RAMs. Tracing the simulation to point 3, the
function of the Reconciler is shown clearly. Half of the KDLX clock cycle is spent
fetching the instruction at the corresponding program counter and the other half cycle is
spent reading data from the memory for the first instruction. So the Reconciler actually
reads the instruction at memory address 0005₁₆ first and then reads the data at address
004A₁₆. This feature makes it possible to consolidate the two different architectures. As
discussed earlier, the instructions should be held until the next rising edge of the KDLX
clock. Thus the Reconciler should not block any data or make a bus high impedance on
instr_ra, instr_rb, and instr_rc.
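
The two-phase behaviour of the Reconciler described above can be sketched as a simple
behavioural model. This is an illustrative Python sketch, not the thesis's VHDL: the
function name `kdlx_cycle` and the dictionary used as memory are assumptions, and write
accesses and bus timing are left out.

```python
def kdlx_cycle(memory: dict, pc: int, data_addr=None):
    """Model one KDLX clock cycle as two memory-clock phases.

    Hypothetical helper: memory is a dict of address -> 24-bit word.
    """
    # First half: the shared memory acts as instruction memory.
    instr = memory.get(pc, 0)
    # Second half: the same memory acts as data memory. The instruction
    # fetched in the first half is held, not blocked or tri-stated.
    data = memory.get(data_addr, 0) if data_addr is not None else None
    return instr, data


# Contents taken from Figure 62: address 05 holds 44044D, address 4A holds 0000AA.
ram = {0x05: 0x44044D, 0x4A: 0x0000AA}
instr, data = kdlx_cycle(ram, pc=0x0005, data_addr=0x004A)
```

This mirrors point 3 in the simulation, where the instruction at 0005₁₆ is read first and
the data at 004A₁₆ is read in the second half of the same KDLX cycle.
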

Instructions at point 2 are executed one KDLX clock cycle after point 3. The data
needed for these instructions is offered at point 4. The wrong data in RAM B is sent to R3
of the second KDLX in the TMRA at this time. It is hard to see, but cid_0 and cid_1 at
point 5 do report errors. The main purpose of this simulation is to show how the different
components work together to realize the concept of the TMR. Therefore, the error reports
will be analyzed later.

Since the voters are hooked up to the output buses of the KDLXs, it may be confusing
that the TMRA reports an error while it is loading data rather than storing it. If this
error were not seen while loading, the TMR would not be able to find it until the next time
the erroneous data is stored into memory. Figure 64 is only a part of the TMR Assembly in
Figure 26 and shows how input data flows.


Figure  64.  Flowing  Direction  of  the  Input  Data  in  TMRA 


The flow of the input data to the KDLXs is shown clearly in Figure 64. Even though the
buses on the voters are not bi-directional, the input data can still be voted by this
scheme. Therefore, the TMR can check data either on loading or storing without waiting
until the wrong data is used.

Going back to point 4 in the simulation result, an error is caught by the voter, so
err_i becomes high and triggers the ISR. At point 6, the signal sel_i switches to
000000₁₆, which allows the Interrupt to insert one TRAP instruction and two NOPs into the
TMRA. Notice that state_i changes to 2₁₆, which is the TrapState of the Interrupt. The
program counter latched is 0008₁₆, so the TMR should jump back to this address when the
ISR is done. At point 7, the TRAP instruction is executed by the KDLX and starts the
ISR portion in Figure 62.

Figure 65. Simulation of the Full Design without ESSD (continued)


This ISR stores all contents of the registers to memory. All data in the registers is
voted this time and any inconsistency should vanish. The wrong data in RAM B ought to be
corrected after this step. Normally the ISR will not write to the original data. It does so
here because this test is meant to prove the ability to correct an error. Thus the same
error should not appear the next time the same instruction is executed.

The contents of R3 show up again at point 8 in the ISR. Any error detected
while in the ISR is ignored since this procedure is correcting an error and the voters will
take care of other errors. The err_i flags at point 8 are ignored as well because it is
known that the data in R3 of the second processor is wrong. Signals cid_0 and cid_1 at
this point report the same error syndrome as the one at point 5. This is easily explained
since the data is the only thing having a problem. If the third opcode of the ISR were
different in one of the processors, signals cid_0 and cid_1 at point 8 would have a
different error syndrome. It can be seen that the Interrupt stays in the WaitState until it
sees the RFE instruction.

Once the Interrupt detects the RFE instruction sent out from RAM A, it starts
its BackState at point 10. The instruction buses of the Reconciler (i.e., instr_ra, instr_rb
and instr_rc) are forced to zero at point 9 when the RFE instruction is detected. The RFE
instruction must never be passed to the TMRA; otherwise it would be fetched and executed at
point 12, and the new Jump instruction at point 10 would become useless.

The Interrupt inserts the new Jump instruction, C80008₁₆, one clock after point 9.
Therefore, it takes three clock cycles for the new program counter to be used after
F80000₁₆ is seen by the Interrupt. The operation codes from address 3A₁₆ to 3C₁₆ in
Figure 62 will not be executed since the Reconciler wants to clean the pipeline before the
TMR goes back to normal operation. So point 11 in the simulation is where the ISR stops. At
this time, both the Reconciler and the Interrupt are already back in their normal states.
The TMR goes back to normal operation at point 12.
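
The sequence of Interrupt states walked through above can be summarized in a coarse
software model. The Python sketch below is an abstraction, not the thesis's state machine:
it ignores the doubled states and all clock timing, the helper names are hypothetical, and
the Jump encoding (opcode C8 in the top byte, 16-bit address below it) is only inferred
from the single example C80008₁₆ with latched program counter 0008₁₆.

```python
RFE = 0xF80000          # RFE opcode quoted in the text
JUMP_OPCODE = 0xC80000  # assumed: top byte of the Jump word C80008


def make_jump(latched_pc: int) -> int:
    """Build the Jump word that returns to the latched program counter."""
    return JUMP_OPCODE | (latched_pc & 0xFFFF)


def interrupt_trace(err_pc: int, isr_words):
    """Return (states, jump_word) for one error and a list of ISR words."""
    states = ["TrapState"]  # error seen: latch PC, insert TRAP and two NOPs
    for word in isr_words:
        if word == RFE:
            # RFE is blocked from the TMRA and a Jump is inserted instead.
            states.append("BackState")
            return states, make_jump(err_pc)
        states.append("WaitState")
    return states, None  # RFE not seen yet: still waiting


states, jump = interrupt_trace(0x0008, [0x45014A, 0x45024B, 0xF80000])
```

For the latched program counter of the simulation, `make_jump(0x0008)` reproduces the Jump
word C80008₁₆ quoted in the text.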

Executing exactly the same instruction sequence again, from address 08₁₆ to 10₁₆ in
Figure 62, proves the error in RAM B has been corrected. No error is reported and the ISR
is not triggered again at point 13 in Figure 66.

A complete ISR should store all contents of the registers to memory and reload them
into the original registers. Inconsistent data between the three processors should then
vanish. The ISR shown in Figure 62 is incomplete in order to keep the simulation simple.
Generally speaking, the ISR should not overwrite the original data. A temporary memory
location needs to be specified for storing and reloading purposes in the ISR. Overwriting
the original data in this simulation serves only to prove the function of the error
correction.

Figure 66. Simulation of the Full Design without ESSD (continued)


C.  ERROR  ANALYSIS 

The analysis of the error in this simulation is quite easy since the data portion is
the only part that needs to be checked. Figure 67 shows the way to check the error.

At point 5 in the simulation, cid_1 is 006E800000000₁₆ and cid_0 is all
zero. A zoom-in on point 5 is shown in Appendix A, section K. It can be quickly identified
as an error from the second processor. Comparing the inconsistent portion of the
data with the cid data shows that they have the same pattern, which demonstrates that the
error report in this design is correct.
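
The comparison described above can be checked numerically. The Python sketch below is a
hedged illustration: the position of the data field on the 51-bit cid bus (bit 0 of the
data at bit 35) is not stated in the text and is assumed here because it reproduces the
reported value, and reading the (cid_1, cid_0) bit pair as a 2-bit processor number is
likewise an assumption.

```python
DATA_SHIFT = 35   # assumed position of data bit 0 on the 51-bit cid bus
GOOD = 0x0000CC   # word at address 4C in RAM A and RAM C (Figure 62)
BAD = 0x000011    # word at address 4C in RAM B


def expected_syndrome(good: int, bad: int, shift: int = DATA_SHIFT) -> int:
    """Place the differing data bits at the assumed data-field position."""
    return (good ^ bad) << shift


def faulty_processor(cid_1: int, cid_0: int, bit: int) -> int:
    """Read the two cid bits at one position as a 2-bit number (assumed code)."""
    return (((cid_1 >> bit) & 1) << 1) | ((cid_0 >> bit) & 1)
```

Under these assumptions, the bits that differ between 0000CC and 000011 form the pattern
DD₁₆, and shifting that pattern into the assumed data field gives 006E800000000₁₆, the
value reported on cid_1.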

Figure 67. Error Analysis for the Full Design


D.  CHAPTER  SUMMARY 

It is exciting to see that this full design works in simulation. The three KDLX
processors work in parallel and the design functions as desired. Confusion about how the
Interrupt or Reconciler works should have been cleared up by the material in this chapter.
The program counter is not latched properly in Figure 59, but it is latched correctly in
the full design. The timing issues of the simulation arise again: changing the way the
program counter is latched in the Interrupt to make it work in Figure 59 may cause the
simulation of the full design to fail.

The last component for a complete TMR design is the Error Syndrome Storage
Device (ESSD). This is a device used to store error syndromes for future analysis. The
full design with ESSD will be introduced in the next chapter.

IX. THE FULL DESIGN WITH ESSD


After designing and simulating the different components, the TMR design is almost
complete. In the previous chapter, it was shown that the voters are able to report
and locate an error when it occurs. Errors on different buses will be reported by
cid_1(50:0), cid_0(50:0), err(50:0), and v_err(50:0). The pattern generated for an error
on these buses is called the error syndrome.

A space system like CFTP will leave the earth for a long time. It is desirable to
have some kind of device to collect the error syndrome whenever an error occurs. The
error syndrome can be used to analyze the health of the system or help understand the
space environment for a system on orbit. If the same error is generated several times, it
can be assumed that a certain device is defective or deviant. The solution may be to
reprogram the FPGA or reset the system. The ESSD is the device designed to collect error
syndromes. In order to be able to download this data after a period of time, the ESSD has
to store the error syndromes to memory.

A. THE FUNCTION OF ESSD

Simulation of the full design without the ESSD was introduced in the previous chapter.
The functions of the ESSD are to store the error syndromes and where the errors are
located in the system. The ESSD design largely follows the concept used to build
the Interrupt. It is a state machine coded in VHDL and runs at double speed, that is,
in synchronization with the memory clock. It has to run at double speed in order to work
with errors generated in either half of the KDLX clock cycle. Because the ISR will be
triggered when an error occurs, the choices for when the ESSD runs are before,
after, or sometime within the ISR.

Halting normal operation is the last choice since the ISR is already designed to do
that. It is reasonable not to interrupt normal operation unless absolutely necessary.
Too many interruptions may decrease the performance of a system or cause the program
to lose track of the instruction sequence. For these reasons, the ESSD is implemented
in the ISR instead of triggering another interrupt routine somewhere in normal operation.

To minimize the impact on the ISR, the ESSD is designed to start right before the first
instruction in the ISR begins. The two NOPs following the TRAP instruction are a good
starting point for the ESSD since the pipeline is clean and no useful instruction is
executing. Consolidating all of the concepts above, the state machine for the ESSD is
constructed as shown in Figure 68, and its VHDL code is in Appendix C, section D.


Figure  68.  State  Machine  of  ESSD 


The first eight states are very similar to the states in the Interrupt. This is because
the ESSD has to wait until two NOPs are inserted. The LatchState_A latches the program
counter, the data address, and the 51-bit data on the cid_0 and cid_1 buses. The
StallState stalls the KDLX in order to start storing the latched error syndromes. The ESSD
stores data to memory as a stack which starts at the bottom and runs to the top. For
simplicity of explanation, we use address 0059₁₆ as the starting point and store data from
the least significant bit to the most significant. This function is illustrated in Figure 69.

Figure 69. Function of ESSD Storing


Each data word in memory is 24 bits wide, so a 51-bit syndrome takes three
clock cycles to store. The most significant three bits of cid_0 and cid_1 are stored with
21 zeros ahead. A counter internal to the ESSD tracks the memory locations. The
next error syndrome will start at address 51₁₆. States from StoreState0_A to
StoreState_pc implement the actions described here. During this period, all of the
processors are stalled and the memory is controlled by the ESSD. The last state is the
BackState, which releases the processors to start the ISR.
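
The storing scheme described above can be sketched in Python to make the word count
concrete. This is an illustrative model, not the thesis's VHDL: the function names are
hypothetical, and the order of the fields (cid_0, then cid_1, then the data address, then
the program counter) is an assumption suggested by the state names StoreState0_A through
StoreState_pc.

```python
WORD_BITS = 24
START_ADDR = 0x0059  # starting address used in the text


def split_syndrome(value: int):
    """Split a 51-bit value into three 24-bit words, least significant first."""
    mask = (1 << WORD_BITS) - 1
    return [(value >> (WORD_BITS * i)) & mask for i in range(3)]


def store_error(memory: dict, addr: int, cid_0: int, cid_1: int,
                data_addr: int, pc: int) -> int:
    """Write one error record downward from addr; return the next free address."""
    # Assumed field order: cid_0, cid_1, data address, program counter.
    words = split_syndrome(cid_0) + split_syndrome(cid_1) + [data_addr, pc]
    for word in words:
        memory[addr] = word
        addr -= 1
    return addr
```

One record occupies eight words (three for each cid bus, one for the data address, one for
the program counter), so a record that starts at 0059₁₆ ends at 0052₁₆ and the next one
starts at 0051₁₆, in agreement with the text.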


The ESSD runs at twice the speed of the TMRA, but the states after NopState1_B
are not doubled as they are in the other state machines. Because the ESSD and the memory
both run at double speed, one memory access can occur in every ESSD state. Therefore,
the states between StoreState0_A and BackState do not need to be duplicated. The Interrupt
and Reconciler stop functioning when the KDLX is stalled. The schematic symbol of the ESSD
is shown in Figure 70.

Figure 70. Schematic Symbol of ESSD


Input signals at the left side are used for latching data from the buses. The output
signals sel_addr(15:0), sel_s(23:0), and sel_wr are used to switch muxes in order to insert
data on addr_s(15:0), ess(23:0), and wr_s, respectively. The stall_s signal goes low to
stall the KDLX when error syndromes are ready to be stored.

B. THE FULL DESIGN WITH ESSD

1.  Schematic 

The schematic for the full design with ESSD is shown in Figure 71. Compared
with Figure 61, the ESSD is added at the bottom right and all incoming or outgoing buses
are intercepted with muxes. The ESSD takes over the RAMs once it starts to store
error syndromes. Three muxes at the input side of the RAMs are used to insert the data
address, data and write signal. The other three muxes on the output buses of the RAMs are
used to block any unrelated data from reaching the Reconciler while the error syndromes
are stored.

Two big latches called latch51 sit on the cid_0 and cid_1 buses ahead of
the ESSD. This part is coded in VHDL and is necessary for this design. It latches data
when err is high and keeps the latched data until the next error is detected. Therefore, the
ESSD can capture cid_0 and cid_1 whenever it wants because this data is available and
stable on the bus. How it functions and why it is vital in this design is explained
further in the simulation discussion.

Figure 71. Schematic of the Full Design with ESSD

2. Simulation


Fewer signals are monitored here than with the full design in the previous chapter,
since the test bench is almost identical except for a few extra instructions for checking
stored error syndromes in memory. The functions of the TMRA, Interrupt and Reconciler in
the full design without ESSD have already been described, so this simulation just shows
how the ESSD works. Important signals and all buses on the ESSD are monitored in the
simulation shown in Figures 72 and 74. This simulation omits most parts identical to those
introduced in the previous chapter. Only the important functions of the ESSD are shown for
explanation.


Figure  72.  Simulation  of  the  Full  Design  with  ESSD 


In Figure 72, five clocks are listed. The Reconciler, Interrupt and ESSD all work
in parallel, so the timing constraints for clk_ir and clk_s are identical. The new clock,
clk_l, for latch51 needs to run at double speed, and it has to be stable before the ESSD
is ready. Because of this, latch51 has less setup and hold time compared with the ESSD.

As before, the error is caught at point 1, and cid_1 and cid_0 indicate where the error
is. Note that cid_1 and cid_0 are here the output data of latch51. Unlike the
simulation in the previous chapter, data on cid_1 and cid_0 show up at point 2 and are
latched until the next error is reported in normal operation. The ESSD, therefore, is able
to store these two values when state_s is 02₁₆.

The most important reason for using latch51 is to make the data stable on the bus.
The zoom-in at point 5 in Figure 63 is shown in Figure 73. The data on cid_1 and cid_0 is
available after the memory clock cycle and becomes unstable before the next rising edge
of the Interrupt or Reconciler clock. Because the ESSD runs at exactly the same
clock speed as the Interrupt and Reconciler, both cid_1 and cid_0 have to be available
until the next rising edge of the Interrupt (or Reconciler) clock in order to be latched
correctly for the ESSD. For this reason, latch51 is designed to keep the data stable,
and the ESSD can thus latch it in any state before storing the error syndromes.
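
The hold behaviour of latch51 can be modeled in a few lines. The sketch below is a
behavioural Python illustration, not the thesis's VHDL: the class name, the method name
`clock`, and the reset value of zero are assumptions.

```python
class Latch51:
    """Hold a 51-bit bus value, updating only while err is high."""

    MASK = (1 << 51) - 1

    def __init__(self):
        self.q = 0  # assumed reset value

    def clock(self, d: int, err: int) -> int:
        """One rising edge of the latch clock; returns the held output."""
        if err:
            self.q = d & self.MASK
        return self.q


latch = Latch51()
held = latch.clock(0x006E800000000, err=1)  # error reported: capture the syndrome
later = latch.clock(0, err=0)               # bus changes, output stays stable
```

Because the output only changes when err is high, the ESSD can read the syndrome in any
later state, which is the property the paragraph above relies on.
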

Figure 73. Detail Timing at point 5 in previous simulation


Back to Figure 72, point 3 is the first instruction fetched in the ISR. While
the KDLX is fetching this instruction, the ESSD asserts stall_s at point 4 to stall the
processors. In the next clock cycle, the muxes are switched to zeros and 0059₁₆ appears
on the address bus to the RAMs.

Figure 74. Simulation of the Full Design with ESSD (continued)


Following the algorithm explained in Figure 69, the bus ess at point 5 shows that this
function works. Once the ESSD finishes at point 6, it gives all of the buses back and
releases the processors. The first instruction of the ISR starts in the next clock cycle.

Extra  instructions  in  the  RAMs  are  for  loading  error  syndromes  stored  in  memory 
back  to  the  registers  for  checking  purposes.  These  instructions  start  at  point  7  and  the 
output  data  at  point  8  proves  that  all  values  are  stored  correctly. 

C.  CHAPTER  SUMMARY 

All components for a complete design have been introduced. The ESSD was not
discussed until this chapter in order to simplify the simulation. Too many things would
need to be explained in the simulation result if the ESSD were not described
separately. This would make the whole simulation look complicated and might not
emphasize the importance of the ISR. Introducing the ESSD separately means that the
functions of the Reconciler, Interrupt, and ESSD are shown clearly in all simulations.

This full design is not merely conceptual; it was simulated and checked. The design of
these components can be improved, and more information is needed to obtain better
performance from the TMR system. These topics for follow-on research will be discussed in
the next chapter.

X. CONCLUSIONS AND FOLLOW-ON RESEARCH


This thesis has described a first working TMR design on an FPGA for
the CFTP. Major components were defined in previous theses, but most of them had
to be redesigned as the KDLX processor became better understood. Each component
was simulated to prove its function. Some timing issues were discussed when different
components were connected with each other. The full design has also proved its ability to
detect and correct an SEU in simulation.

A. OVERVIEW

The TMR Assembly consists of three KDLX processors and voters in order to
detect and correct errors. A majority voter can only handle one error at a time. Since the
TMR Assembly has several voters in it, it is able to report errors on different signals
simultaneously. For example, the cid_1 and cid_0 buses of the TMRA can identify errors on
the program counter and data at the same time. The processor causing errors on the
program counter may not be the same one that generates errors on data.

In order to coordinate memory access, the Reconciler is built to consolidate the
Harvard and Von Neumann architectures. It runs twice as fast as the KDLX clock
and performs the instruction memory access first, followed by the data memory access.
This component purely implements read and write access to memory and does not
relate directly to error detection or correction. The Interrupt provides an ISR to correct
any inconsistency in registers between the three processors. This unit is triggered when an
error is found by the TMRA. If an error is caused somewhere on the bus but not inside the
registers, the ISR will still be triggered but no error will be found. An error syndrome
records the program counter, the memory address, and any inconsistent bits on data,
address, program counter, read, write and program read in the cid buses. This information
is latched in the ESSD and stored to memory during the ISR. Analyzing error
syndromes can help a designer to correct or fix the current design.

B. CONCLUSIONS

A simple flow chart in Figure 75 illustrates the overall procedure to correct an
error in TMR. The role of each component in the full design can be understood clearly.
The Interrupt exists for error correction only and the ESSD for storing
error syndromes only.


Figure  75.  Flowchart  of  Error  Correction  for  TMR  design 

A reprogrammable space device such as CFTP has great potential for the future.
The TMR on an FPGA functions as an SOC, which saves board space and offers the
flexibility of modification. Utilizing the TMR design with some other features makes the
CFTP act as an error-free device. Its reconfigurability widens its usage
in missions and lets state-of-the-art technology be applied to many applications.

C. FOLLOW-ON RESEARCH

A first functioning TMR design is complete. This circuit was simulated and
verified in software. It is possible to instantiate this design on a development board to
verify its function. Before doing that, some modifications need to be made. The performance
of each component can be improved as well. Furthermore, a faster soft-core processor
will be needed to speed up the overall performance of the TMR.

1. Modification on Current Design

Most components like the Reconciler, Interrupt and ESSD are essentially state
machines coded in VHDL. It is possible to have these three in one big state machine since
they all run at double speed. One needs to have a clear understanding of the different
functions of the different components in order to do this. Debugging this kind of big state
machine needs to be done carefully since any modification of one state may affect functions
in other states. On the other hand, there are several different ways to code a component.
Other methodologies are sometimes better than using a state machine, depending on
the characteristics of these different components.

A voter error is not considered in this thesis due to time constraints. This kind of
error does not need to trigger the ISR. When a voter votes incorrectly, the output is not
trustworthy. The data can be either discarded or re-voted based on the situation. The ESSD
may need to be revised so as not to save all error syndromes in order to save memory
space.

The memory selected for the simulation is based on the availability of the ISE
software. If possible, a real Von Neumann architecture memory should be built.
Modifications on the TMRA and Reconciler will be necessary at that time. The real
environment on the development board must be considered before these modifications. This
avoids duplicate work and makes it possible to compare the simulation result in software
with the one in hardware.

An SEU can occur anywhere in the TMR design. More issues need to be solved
if this error occurs in the Reconciler, Interrupt or ESSD. Increasing the reliability also
increases the probability of having an SEU. The trade-off between these conditions
needs more discussion.

2. Faster Processors

Several requirements are considered when searching for a faster processor. First,
the new processor has to be faster than the current 16-bit RISC KDLX. Second, it has to
be a soft-core processor. Third, it needs to be compatible with the Xilinx Virtex XCV800
HQ240 FPGA selected for the CFTP. Other features such as the use of cache or a Harvard
architecture can be reconsidered.

Many soft-core processors nowadays use cache to improve their performance
even though it is possible to have an SEU in it. Detecting and correcting an SEU in a
cache cannot use the same method as with the registers. The contents of the caches need
to be reloaded by some method. A study of SEE on a Pentium®⁵ III processor shows
that utilizing cache in different ways can change the test results dramatically [12].
Therefore, it is possible to take advantage of cache without increasing the probability of
having an error, and consideration of future processors should include ones with cache.

Using a Von Neumann architecture processor would simplify the TMR design.
The Reconciler can be removed and less control is needed in the TMRA for the data bus.


Table  21  lists  some  candidate  commercial  processors  that  are  currently  available. 


Commercial Processors

Company    Processor         Architecture  Features
Xilinx     MicroBlaze        32-bit RISC   1. No cache
                                           2. Harvard bus
ARM        ARM7TDMI          32-bit RISC   1. Most have cache
                                           2. Von Neumann bus
                                           3. Hard core
MIPS       MIPS64 5Kc (5Kf)  64-bit RISC   1. Programmable cache 0-64KB
                                           2. Co-processor interface
                                           3. Floating-point pipeline
                                           4. Hard core
MIPS       MIPS64 20Kc       64-bit RISC   1. 32KB caches
                                           2. Superscalar
                                           3. Hard core
Sandcraft  SR71010B          64-bit RISC   1. MIPS64 based
                                           2. L1 32KB cache
Tensilica  Xtensa            32-bit RISC   1. Local data and instruction caches
Altera     Nios              32-bit RISC   1. Instruction master is a 16-bit wide,
                                              latency-aware Avalon bus master
                                           2. Configurable cache size
ARC        ARCtangent-A4     32-bit RISC   1. Processor can be configured with Harvard
                                              bus architecture (separate instruction/data
                                              buses) or a von Neumann bus architecture
                                              (unified instruction/data buses)
                                           2. User-configurable instruction and data
                                              cache

Table 21. Commercial Soft-Core Processors


5  Pentium  is  a  registered  trademark  of  Intel  Corporation. 




Some processors have a configurable cache, which gives the user some flexibility.
The advantages and disadvantages of soft-core and hard-core processors were
described in Chapter I, so no hard-core processors are considered. Candidates for the
TMR are MicroBlaze, SR71010B, Xtensa, Nios, and ARCtangent-A4.

Commercial processors are usually expensive because they are proprietary.
Sometimes these processors come with their own development kit, which makes
implementation with other software impossible. Part of the design of a commercial processor is
sometimes protected by the company and not accessible to the user. Even though
revising a processor is not always required, studying the source code is a good and fast way to
understand the processor itself. In addition, information about these commercial
processors is limited, since most of the time only the data sheet can be found on the Internet.

Sometimes people share their inventions or modifications of cores with the public.
These cores may or may not be fully tested, and usually the designer is looking for other
people to test them. These cores are called OpenCores. OpenCores are free and can be
easily downloaded from the Internet. The disadvantage of using OpenCores is that they are
hard to use. Some designers do not describe their designs in detail, and development tools
vary from designer to designer. People post their questions on the website and hope
someone will answer them, so there is no customer support like that offered for commercial
processors. Some OpenCores are collected in Table 22.

Some information is not complete due to the lack of description by designers or
other users. These cores do not have many restrictions and can be modified if desired.
Based on the information found, the SPARC and RISC R1000 are very common
processors. The RISC R1000 has been tested and successfully ran a video image program.
Many devices are also compatible with this processor. The RISC R1200 is almost
identical to the R1000 except for the cache inside. The Yellow Star, which is
actually the MIPS32 R3000 processor, is known as a very powerful processor. It has been
tested by many users as well.

OpenCores

Architecture   Name               Features
------------   ----------------   ----------------------------------------------------
SPARC V8       LEON VHDL          1. AMBA AHB and APB on-chip buses
               32 bit             2. Data cache is a direct-mapped cache configurable
                                     to 1-64 kbyte
SPARC V7       ERC32              1. A radiation-tolerant processor developed for
               32 bit                space applications
                                  2. Two platforms are supported: SPARC Solaris-2.5.1
                                     (or higher), and x86 linux (libc5)
                                  3. VHDL model runs on Unix systems
RISC           OpenRisc R1000     1. Tested on Xess XSV800 and Flextronics
               32 bit                Semiconductor development boards
RISC           OpenRisc R1200     1. Tested on Xess XSV800 and Flextronics
               32 bit                Semiconductor development boards
                                  2. Cache
RISC           Yellow Star        1. Capable of executing 32-bit instructions based on
               (MIPS32 R3000)        the MIPS R3000 microprocessor instruction set and
               32 bit                has been tested running large blocks of compiled
                                     C code
                                  2. Fully functional and compatible interrupt system.
                                     Can handle all exceptions cleanly and correctly
                                  3. On-chip cache control and Memory Management Unit
RISC           Risc 16f84         1. The "risc16f84_clk2x.v" core has been coded
                                     completely, synthesized and tested for correct
                                     operation (and debugged!) inside a Xilinx
                                     XC2S200 FPGA
RISC           Plasma             1. Supports interrupts and all MIPS I(TM) user mode
                                     instructions except unaligned load and store
                                     operations (which are patented) and exceptions,
                                     which can be easily avoided
                                  2. Tested on an Altera FPGA running at 16.5 MHz
                                     (synthesized for 29.8 MHz)
                                  3. Currently running on an Altera
                                     EP20K200EFC484-2X FPGA and a Xilinx FPGA

Table 22. OpenCores


These OpenCores have been tested and proven on certain FPGAs. In order to use these
processors in the TMR design, more study of their source code is required.
Finally, they will need to be tested and simulated in the ISE software before any design
work related to the TMR.

APPENDIX A: SCHEMATICS


Appendix A contains all schematics, test benches and simulation results of the
components in this thesis. Simple schematic symbols are introduced as figures and are
not included here. The features and settings of each component and test bench are summarized as
well. A long test bench is divided into pieces and only the important parts are shown.
Sometimes a different representation is used in order to explain how a component will be
tested.


The simulation result is always shown completely. Important parts that need to be
explained are duplicated or modified. All values used in the test bench and
the simulation result are hexadecimal and R0 is always zero.

A.  24-BIT  MEMORY 
1.  Schematic 


This memory is a RAM. It is triggered at the rising clock edge. Both the write
enable (WE) and memory enable (EN) pins are active low. The default value of this
memory is zero.
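
The behavior described above can be sketched as a small behavioral model. This is an illustrative Python sketch only, not the schematic used in this thesis; the class name `Ram24`, the method `rising_edge`, and the choice that a write is visible on the output in the same cycle are assumptions.

```python
class Ram24:
    """Behavioral sketch of a 256-word by 24-bit RAM (hypothetical model)."""

    def __init__(self):
        self.cells = [0] * 256      # 8-bit address; default content is zero
        self.dout = 0               # 24-bit data output

    def rising_edge(self, addr, din, we_n, en_n):
        """Apply one rising clock edge; WE and EN are active low."""
        if en_n == 0:                               # memory enabled
            if we_n == 0:                           # write enabled
                self.cells[addr & 0xFF] = din & 0xFFFFFF
            # Assumption: the output follows the addressed cell, so a
            # write is visible on DOUT in the same cycle.
            self.dout = self.cells[addr & 0xFF]
        return self.dout                            # unchanged when disabled


ram = Ram24()
ram.rising_edge(0x01, 0x000047, we_n=0, en_n=0)    # write 000047 to address 01
value = ram.rising_edge(0x01, 0, we_n=1, en_n=0)   # read address 01 back
```

With EN held high the model ignores both reads and writes, which mirrors the active-low enable described in the text.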


[Schematic: 24-bit memory symbol with inputs addr(7:0), data_in(23:0), we,
enable_m and clk, and output data_out(23:0)]


2. Test Bench


This test bench was originally in a single row. It is cut into two rows in order to
fit the paper size. The vertical line at time 2100 ns is the stop point of the simulation.
The clock high and low times are each 50 ns. The input setup time and output valid delay are each 10 ns.
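
The timing figures above imply a clock period and a cycle count that the text does not state. A minimal sketch of that arithmetic follows; the function names are hypothetical, and it assumes the clock starts at time 0.

```python
def clock_period_ns(high_ns, low_ns):
    """One clock period is the high time plus the low time."""
    return high_ns + low_ns


def full_cycles(stop_ns, period_ns):
    """Number of complete clock periods before the stop time."""
    return stop_ns // period_ns


period = clock_period_ns(50, 50)     # 100 ns, i.e. a 10 MHz clock
cycles = full_cycles(2100, period)   # 21 complete periods before the stop point
```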

3. Simulation Result


[Simulation waveform: signals /testbench/clk, /testbench/addr, /testbench/data_in,
/testbench/enable_m, /testbench/we and /testbench/data_out]


B.  KDLX  WITHOUT  MEMORY 

1. Schematic


[Schematic symbol: dlx]


2. Test Bench


The data bus is high impedance. Two values are supplied at clock cycles 5 and 6 for
the KDLX to load into registers. The clock high and low times are each 50 ns. The input setup time
and output valid delay are each 10 ns.

3. Simulation Result


1. Schematic


The instruction memory at the left side is a ROM. The data memory at the right
side is a RAM. The data memory is pre-configured with 0003 (hexadecimal). Both memories are
triggered at the rising clock edge.

2. Test Bench of Instruction Set


For the processor, the clock high and low times are each 50 ns; the input setup time and
output valid delay are each 10 ns. For the memories, all timing settings are half those of the processor.
The bi-directional bus is high impedance.


Nothing special is needed in the test bench, so only the first and last parts are
shown here. The KDLX is reset and the memories are enabled at time 200 ns. Since the
instructions are configurable, the test benches for all instruction sets are the same.

3. Tables and Simulation Results of Instruction Sets


a. Implementation Table of Instruction Set 1


Instruction   (operation symbol)       Opcode   Expected Value
-----------   ----------------------   ------   --------------
LW            R1←Mem(R0+03)            440103
SW            R1→Mem(R0+08)            450108   0003
LW            R2←Mem(R0+04)            440204
SW            R2→Mem(R0+09)            450209   0003
ADD           R1+R2→R3                 011320
SW            R3→Mem(R0+0D)            45030D   0006
ADDI          R1+ext(F9)→R4            4114F9
SW            R4→Mem(R0+0E)            45040E   FFFC
ADDUI         R1+(0A)→R5               21150A
SW            R5→Mem(R0+0F)            45050F   000D
AND           R1*R3→R6                 091630
SW            R6→Mem(R0+10)            450610   0002
ANDI          R4*(FD)→R7               2947FD
SW            R7→Mem(R0+11)            450711   00FC
LHI           R8←FF||(0)⁸              0808FF
SW            R8→Mem(R0+12)            450812   FF00
OR            R1+R3→R9                 0A1930
SW            R9→Mem(R0+13)            450913   0007
ORI           R1+(F0)→R10              2A1AF0
SW            R10→Mem(R0+14)           450A14   00F3
SEQ           R1=R2→R11=1              181B20
SW            R11→Mem(R0+15)           450B15   0001
SEQ           R1≠R3→R12=0              181C30
SW            R12→Mem(R0+16)           450C16   0000
SEQI          R1=(0003)→R13=1          581D03
SW            R13→Mem(R0+17)           450D17   0001
SEQI          R1≠(0004)→R14=0          581E04
SW            R14→Mem(R0+18)           450E18   0000
SLL           R4<<R2=(0003)→R15        114F20
SW            R15→Mem(R0+19)           450F19   FFE0
SLLI          R4<<(0005)→R3            514305
SW            R3→Mem(R0+1A)            45031A   FF80
SRA           R4>>R1=(0003)→R5         134510
SW            R5→Mem(R0+1B)            45051B   FFFF
SRLI          R4>>(0003)→R6            524603
SW            R6→Mem(R0+1C)            45061C   1FFF
SUBI          R8-ext(7B)→R7            43877B
SW            R7→Mem(R0+1D)            45071D   FE85
XOR           R9⊕R10→R11               0B9BA0
SW            R11→Mem(R0+1E)           450B1E   00F4
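
The expected values in this table can be cross-checked with a few lines of arithmetic. The following is a sketch under stated assumptions, not the KDLX implementation: registers are taken to be 16 bits wide, `ext()` is taken to sign-extend an 8-bit immediate, the logical and unsigned immediates are zero-extended, and the helper names are hypothetical. These assumptions are inferred from the expected values themselves.

```python
MASK16 = 0xFFFF


def sign_extend8(imm):
    """Sign-extend an 8-bit immediate to a 16-bit value."""
    imm &= 0xFF
    return (imm | 0xFF00) if (imm & 0x80) else imm


def sra16(value, shift):
    """Arithmetic right shift of a 16-bit value."""
    signed = value - 0x10000 if value & 0x8000 else value
    return (signed >> shift) & MASK16


r1 = r2 = 0x0003                                   # loaded by the two LW rows
r3 = (r1 + r2) & MASK16                            # ADD
r4 = (r1 + sign_extend8(0xF9)) & MASK16            # ADDI
r8 = (0xFF << 8) & MASK16                          # LHI
r9 = r1 | r3                                       # OR
r10 = r1 | 0xF0                                    # ORI

expected = {
    "ADD": r3,
    "ADDI": r4,
    "ADDUI": (r1 + 0x0A) & MASK16,
    "AND": r1 & r3,
    "ANDI": r4 & 0xFD,
    "LHI": r8,
    "OR": r9,
    "ORI": r10,
    "SLL": (r4 << r2) & MASK16,
    "SLLI": (r4 << 5) & MASK16,
    "SRA": sra16(r4, r1),
    "SRLI": r4 >> 3,
    "SUBI": (r8 - sign_extend8(0x7B)) & MASK16,
    "XOR": r9 ^ r10,
}

for name, value in expected.items():
    print(f"{name:5s} {value:04X}")
```

Each printed value can be compared with the Expected Value column of the SW row that follows the corresponding instruction.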

b. Simulation Result of Instruction Set 1


[Simulation waveform: signals /testbench/clk_p, /testbench/clk_ram_rom,
/testbench/en_rom, /testbench/en_ram, /testbench/reset_p, /testbench/stall_p,
/testbench/instr_pass, /testbench/out_mem, /testbench/prog_rd_p,
/testbench/read_p, /testbench/write_p and /testbench/data_p]



c. Tables of Registers and Memories in Simulation 1


Instruction Mem

Addr   Value     Addr   Value
----   ------    ----   ------
00               2D     45071D
01     440103    2E     450B1E
02     440204    2F     000000
03     000000    30     000000
04     000000    31     000000
05     450108    32     450101
06     450209    33     450201
07     000000    34     450301
08     011320    35     450401
09     4114F9    36     450501
0A     21150A    37     450601
0B     000000    38     450701
0C     091630    39     450801
0D     45030D    3A     450901
0E     45040E    3B     450A01
0F     45050F    3C     450B01
10     450610    3D     450C01
11     2947FD    3E     450D01
12     0808FF    3F     450E01
13     0A1930    40     450F01
14     2A1AF0    41     000000
15     450711    42     000000
16     450812    43     000000
17     450913    44     44010D
18     450A14    45     44020E
19     181B20    46     44030F
1A     181C30    47     440410
1B     581D03    48     440511
1C     581E04    49     440612
1D     450B15    4A     440713
1E     450C16    4B     440814
1F     450D17    4C     440915
20     450E18    4D     440A16
21     114F20    4E     440B17
22     514305    4F     440C18
23     134510    50     440D19
24     524603    51     440E1A
25     450F19    52     440F1B
26     45031A    53     44011C
27     45051B    54     44021D
28     45061C    55     44031E
29     43877B    56     000000
2A     0B9BA0    57     000000
2B     000000    58     000000
2C     000000    59     000000

Register

Register   Value(s)
--------   -----------
00
01         0003
02         0003
03         0006   FF80
04         FFFC
05         000D   FFFF
06         0002   1FFF
07         00FC   FE85
08         FF00
09         0007
10         00F3
11         0001   00F4
12         0000
13         0001
14         0000
15         FFE0

Data Mem

Addr   Value
----   -----
08     0003
09     0003
0D     0006
0E     FFFC
0F     000D
10     0002
11     00FC
12     FF00
13     0007
14     00F3
15     0001
16     0000
17     0001
18     0000
19     FFE0
1A     FF80
1B     FFFF
1C     1FFF
1D     FE85
1E     00F4

Addresses 00 through 07, 0A through 0C, and 1F through 2A have no entry.

d. Implementation Table of Instruction Set 2


Instruction   (pseudo code)            Opcode   Expected Value
-----------   ----------------------   ------   --------------
SGE           R1≥R3→R13=1              191D30
SW            R13→Mem(R0+1F)           450D1F   0001
SGE           R15≥R14→R9=0             19F9E0
SW            R9→Mem(R0+20)            450920   0000
SGEI          R15≥ext(E8)→R10=0        59FAE8
SW            R10→Mem(R0+21)           450A21   0000
SGEI          R15≥ext(E0)→R11=1        59FBE0
SW            R11→Mem(R0+22)           450B22   0001
SGT           R4>R15→R6=1              1A46F0
SW            R6→Mem(R0+23)            450623   0001
SGT           R15>R4→R7=0              1AF740
SW            R7→Mem(R0+24)            450724   0000
SGTI          R15>ext(FF)→R8=0         5AF8FF
SW            R8→Mem(R0+25)            450825   0000
SGTI          R15>ext(87)→R9=1         5AF987
SW            R9→Mem(R0+26)            450926   0001
SLE           R1=R2→R10=1              1B1A20
SW            R10→Mem(R0+27)           450A27   0001
SLE           R1≤R13→R11=0             1B1BD0
SW            R11→Mem(R0+28)           450B28   0000
SLEI          R1≤ext(03)→R12=1         5B1C03
SW            R12→Mem(R0+29)           450C29   0001
SLEI          R1≤ext(02)→R13=0         5B1D02
SW            R13→Mem(R0+2A)           450D2A   0000
SLT           R15<R1→R6=1              1CF610
SW            R6→Mem(R0+01)            450601   0001
SLT           R1<R15→R7=0              1C17F0
SW            R7→Mem(R0+02)            450702   0000
SLTI          R1<ext(0D)→R8=1          5C180D
SW            R8→Mem(R0+03)            450803   0001
SLTI          R1<ext(01)→R9=0          5C1901
SW            R9→Mem(R0+04)            450904   0000
SNE           R1≠R2→R10=0              1D1A20
SW            R10→Mem(R0+05)           450A05   0000
SNE           R1≠R15→R11=1             1D1BF0
SW            R11→Mem(R0+06)           450B06   0001
SNEI          R1≠ext(03)→R12=1         581C03
SW            R12→Mem(R0+07)           450C07   0001
SNEI          R15≠ext(E1)→R13=0        58FDE1
SW            R13→Mem(R0+08)           450D08   0000
SRAI          R3>>(0006)→R6            533606
SW            R6→Mem(R0+09)            450609   FFFE
SRL           R3>>R2=(0003)→R7         123720
SW            R7→Mem(R0+0A)            45070A   1FF0
XORI          R15⊕(8A)→R8              2BF88A
SW            R8→Mem(R0+0B)            45080B   FF6A
SUBUI         R3-(80)→R9               233980
SW            R9→Mem(R0+0C)            45090C   FF00
SUB           R1-R3→R14                031E30
SW            R14→Mem(R0+0D)           450E0D   0083
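
The set-on-compare results for SGE, SGT and SLT above only make sense if the 16-bit register values are compared as signed numbers. The sketch below illustrates this under stated assumptions: signed two's-complement comparison, `ext()` as sign extension of an 8-bit immediate, and hypothetical helper names. The register values R1 = 0003, R4 = FFFC, R14 = 0000 and R15 = FFE0 are the ones listed in the register table for Simulation 2.

```python
def to_signed16(value):
    """Interpret a 16-bit pattern as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sign_extend8(imm):
    """Interpret an 8-bit immediate as a signed number."""
    imm &= 0xFF
    return imm - 0x100 if imm & 0x80 else imm


R1, R4, R14, R15 = 0x0003, 0xFFFC, 0x0000, 0xFFE0

results = {
    "SGE  R15,R14": int(to_signed16(R15) >= to_signed16(R14)),
    "SGEI R15,E8":  int(to_signed16(R15) >= sign_extend8(0xE8)),
    "SGEI R15,E0":  int(to_signed16(R15) >= sign_extend8(0xE0)),
    "SGT  R4,R15":  int(to_signed16(R4) > to_signed16(R15)),
    "SGT  R15,R4":  int(to_signed16(R15) > to_signed16(R4)),
    "SGTI R15,FF":  int(to_signed16(R15) > sign_extend8(0xFF)),
    "SGTI R15,87":  int(to_signed16(R15) > sign_extend8(0x87)),
    "SLT  R15,R1":  int(to_signed16(R15) < to_signed16(R1)),
    "SLT  R1,R15":  int(to_signed16(R1) < to_signed16(R15)),
}
```

An unsigned comparison would give different results for several of these rows (for example, FFE0 would be greater than 0003), so the table is consistent with signed comparison.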

e.  Simulation  Result  of  Instruction  Set  2 


[Simulation waveform: signals /testbench/clk_p, /testbench/clk_ram_rom,
/testbench/en_rom, /testbench/en_ram, /testbench/reset_p, /testbench/stall_p,
/testbench/instr_pass, /testbench/out_mem, /testbench/prog_rd_p,
/testbench/read_p, /testbench/write_p and /testbench/data_p]

f. Tables of Registers and Memories in Simulation 2


Instruction Mem

Addr   Value     Addr   Value
----   ------    ----   ------
00               30     450A21
01     410103    31     450B22
02     410203    32     1A46F0
03     0803FF    33     1AF740
04     0804FF    34     5AF8FF
05     0805FF    35     5AF987
06     08061F    36     450623
07     410380    37     450724
08     4104FC    38     450825
09     4105FF    39     450926
0A     2166FF    3A     1B1A20
0B     0807FE    3B     1B1BD0
0C     0808FF    3C     5B1C03
0D     080FFF    3D     5B1D02
0E     210AF3    3E     450A27
0F     217785    3F     450B28
10     210BF4    40     450C29
11     410907    41     450D2A
12     410D01    42     1CF610
13     410E00    43     1C17F0
14     410C00    44     5C180D
15     410FE0    45     5C1901
16     000000    46     450601
17     000000    47     450702
18     450100    48     450803
19     450200    49     450904
1A     450300    4A     1D1A20
1B     450400    4B     1D1BF0
1C     450500    4C     581C03
1D     450600    4D     58FDE1
1E     450700    4E     450A05
1F     450800    4F     450B06
20     450900    50     450C07
21     450A00    51     450D08
22     450B00    52     533603
23     450C00    53     123720
24     450D00    54     2BF88A
25     450E00    55     233980
26     450F00    56     031E30
27     000000    57     450609
28     000000    58     45070A
29     000000    59     45080B
2A     191D30    5A     45090C
2B     19F9E0    5B     450E0D
2C     59FAE8    5C     000000
2D     59FBE0    5D     000000
2E     450D1F    5E     000000
2F     450920    5F     000000

Register

Register   Value(s)
--------   -----------
00
01         0003   0003
02         0003   0003
03         FF80   FF80
04         FFFC   FFFC
05         FFFF   FFFF
06         1FFF   FFFE
07         FE85   1FF0
08         FF00   FF6A
09         0007   FF00
10         00F3   0000
11         00F4   0001
12         0000   0001
13         0001   0000
14         0000   0083
15         FFE0   FFE0

Data Mem

Addr   Value     Addr   Value
----   -----     ----   -----
01     0001      1F     0001
02     0000      20     0000
03     0001      21     0000
04     0000      22     0001
05     0000      23     0001
06     0001      24     0000
07     0001      25     0000
08     0000      26     0001
09     FFFE      27     0001
0A     1FF0      28     0000
0B     FF6A      29     0001
0C     FF00      2A     0000
0D     0083

Addresses 00 and 0E through 1E have no entry.

g. Implementation Table of Instruction Set 3


Instruction   (pseudo code)                           Opcode   Expected Value
-----------   -------------------------------------   ------   --------------
LW            R1←Mem(R0+03)                           410103
LW            R2←Mem(R0+04)                           410204
LW            R3←Mem(R0+00)                           410300
LW            R4←Mem(R0+06)                           410406
BNEZ          R1≠0→Prog_Addr←(05)+1+ext(04)           C01004
              Note: PC=05 and (05)+1+ext(04)=0A
BEQZ          R3=0→Prog_Addr←(0A)+1+ext(04)           C13004
              Note: PC=0A and (0A)+1+ext(04)=0F
ADDI          R0+ext(25)→R5                           410525
J             (0020)→Prog_Addr                        C80020
JAL           (0014)→Prog_Addr; (23)→R15              E80014
              Note: (23) is return address
ADDI          R0+ext(8A)→R6                           41068A
ADDI          R0+ext(40)→R7                           410740
ADD           R1+R2→R8                                011820
ADD           R1+R4→R9                                011940
SW            R15→Mem(R0+01)                          450F01   0023
JALR          R5→Prog_Addr; (1D)→R15                  685000
              Note: (1D) is return address
J             (0030)→Prog_Addr                        C80030
SW            R5→Mem(R0+02)                           450502   0025
SW            R6→Mem(R0+03)                           450603   FF8A
SW            R7→Mem(R0+04)                           450704   0040
SW            R8→Mem(R0+05)                           450805   0007
SW            R9→Mem(R0+06)                           450906   0009
SW            R15→Mem(R0+07)                          450F07   001D
JR            R7→Prog_Addr                            487000
SW            R2→Mem(R0+08)                           450208   0004
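
The address arithmetic in the notes of this table can be written out as a short sketch. The branch formula PC + 1 + ext(offset) is taken directly from the table notes. The return-address rule PC + 3 is only an inference from the listed values (JAL at address 20 stores 23 and JALR at address 1A stores 1D, with both addresses read from the instruction memory table of Simulation 3); it is not stated in the text. The function names are hypothetical.

```python
def sign_extend8(imm):
    """Interpret an 8-bit offset as a signed number."""
    imm &= 0xFF
    return imm - 0x100 if imm & 0x80 else imm


def branch_target(pc, offset8):
    """Target of a taken BNEZ or BEQZ: PC + 1 + ext(offset)."""
    return (pc + 1 + sign_extend8(offset8)) & 0xFFFF


def link_address(pc):
    """Return address stored by JAL or JALR (inferred rule: PC + 3)."""
    return (pc + 3) & 0xFFFF


bnez_target = branch_target(0x05, 0x04)   # BNEZ at PC=05
beqz_target = branch_target(0x0A, 0x04)   # BEQZ at PC=0A
jal_return = link_address(0x20)           # JAL at address 20
jalr_return = link_address(0x1A)          # JALR at address 1A
```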

h. Simulation Result of Instruction Set 3


[Simulation waveform: signals /testbench/clk_p, /testbench/clk_ram_rom,
/testbench/en_rom, /testbench/en_ram, /testbench/reset_p, /testbench/stall_p,
/testbench/out_mem, /testbench/prog_rd_p, /testbench/read_p,
/testbench/write_p and /testbench/data_p]

i. Tables of Registers and Memories in Simulation 3


Instruction Mem

Addr   Value     Addr   Value     Addr   Value
----   ------    ----   ------    ----   ------
00               13               26     000000
01     410103    14     011820    27     000000
02     410204    15     011940    28
03     410300    16     450F01    29
04     410406    17     000000    2A
05     C01004    18     000000    30     450502
06     000000    19     000000    31     450603
07     000000    1A     685000    32     450704
08               1B     000000    33     450805
09               1C     000000    34     450906
0A     C13004    1D               35     450F07
0B     410525    1E               36     487000
0C     000000    1F               37     000000
0D               20     E80014    38     000000
0E               21     41068A    39
0F     C80020    22     410740    40     450208
10     000000    23               41     000000
11     000000    24               42     000000
12               25     C80030    43     000000

Register

Register   Value(s)
--------   -----------
00
01         0003
02         0004
03         0000
04         0006
05         0025
06         FF8A
07         0040
08         0007
09         0009
10
11
12
13
14
15         0023   001D

Data Mem

Addr   Value
----   -----
01     0023
02     0025
03     FF8A
04     0040
05     0007
06     0009
07     001D
08     0004

Addresses 00 and 09 through 2A have no entry.

j. Implementation Table of Instruction Set 4


Instruction   (operation symbol)                 Opcode   Expected Value
-----------   --------------------------------   ------   --------------
ADDI          R0+ext(04)→R1                      410104
ADDI          R0+ext(07)→R2                      410207
TRAP          (0020)→Prog_Addr; (06)→IAR         280020
              Note: (06) is return address
ADDI          R0+ext(09)→R3                      410309
ADDI          R0+ext(15)→R4                      410415
ADDI          R0+ext(0A)→R7                      41070A
ADDI          R0+ext(11)→R8                      410811
ADDI          R0+ext(C2)→R10                     410AC2
RFE           (06)→Prog_Addr                     F80000
              Note: (06) is IAR
J             (0011)→Prog_Addr                   C80011
SW            R1→Mem(R0+01)                      450101   0004
SW            R2→Mem(R0+02)                      450202   0007
SW            R3→Mem(R0+03)                      450303   0009
SW            R4→Mem(R0+04)                      450404   0015
SW            R7→Mem(R0+07)                      450707   000A
SW            R8→Mem(R0+08)                      450808   0011
SW            R10→Mem(R0+0A)                     450A0A

k. Simulation Result of Instruction Set 4


[Simulation waveform: signals include /testbench/clk_p, /testbench/clk_ram_rom,
/testbench/en_rom, /testbench/reset_p and /testbench/stall_p]


l. Tables of Registers and Memories in Simulation 4


[Tables of registers and memories]


D. TMR ASSEMBLY WITHOUT MEMORIES
1. Schematic

This is the design without the latch at the bottom. The three KDLX processors are at
the left and the six voters are at the center. Signals such as V_ERR, CID_1, CID_0, and ERR
are collected individually into four buses at the right. The read signal is used to enable the
buffers for data from memory. The write signal is used to enable the buffers for data to
memory.

2. Test Bench


The clock high and low times are each 50 ns. The input setup time and output
valid delay times are each 10 ns. Since there are only two instructions, the test bench
is simple. It loads data into a register and stores it back to memory to check whether this
schematic works properly.

3. Simulation Result


As described in Chapter V, this schematic without a latch does not write correct
data into the registers due to a timing problem. This kind of error disappears when
memories are connected. Because this appendix only displays the final design of each
component, the imperfect simulation result is still included here. The TMR with a latch
is discussed in Chapter V, so it is not included here even though it works perfectly
without memories.


[Simulation waveform: signals /testbench/clk_p, /testbench/data_m,
/testbench/instr_a, /testbench/instr_b, /testbench/instr_c, /testbench/reset_p,
/testbench/stall_p, /testbench/prog_rd_p, /testbench/pc_p, /testbench/read_p,
/testbench/write_p, /testbench/cid_1, /testbench/cid_0, /testbench/err,
/testbench/v_err, /testbench/addr_p and /testbench/data_p. The three instruction
inputs carry 440104 followed by 450102.]


E. TMR ASSEMBLY WITH MEMORIES
1. Schematic

This schematic uses the TMR Assembly without a latch. The instruction memory
on the left side sends one instruction to the three processors at the same time. Therefore,
this schematic is used only for checking basic functions. Nothing related to fault
tolerance can be tested here.

2. Test Bench


Since the instructions are pre-configured in the ROM and the RAM has the default value 0003 (hexadecimal),
no data needs to be assigned. The test bench ends at 2900 ns. The clock high and low
times for both memories and processors are each 50 ns. The input setup time and output
valid delay are 10 ns for processors and 5 ns for memories.

3. Simulation Result

F. FAULT-TOLERANT TESTING
1. Schematic

This simulation uses three ROMs in order to insert different
instructions. This simulates the condition in which the three processors receive inconsistent
instructions. The TMRA can also be modified to connect with three different RAMs. The
simulation would then be more complex and much more time would be needed for analysis. As discussed
in Chapter V, such errors should be caught and corrected by the voters as long as no more
than one SEU occurs in a voter.
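
The voting principle referred to above can be illustrated with a bitwise majority function. This is a generic sketch of two-out-of-three voting, not the voter schematic of this thesis; the function name `vote`, the `width` parameter and the per-input error flags are assumptions made for illustration.

```python
def vote(a, b, c, width=24):
    """Bitwise two-out-of-three majority vote over three words.

    Returns the voted word and a tuple of flags telling which inputs
    disagree with the voted result.
    """
    mask = (1 << width) - 1
    voted = ((a & b) | (a & c) | (b & c)) & mask
    disagree = tuple(((x ^ voted) & mask) != 0 for x in (a, b, c))
    return voted, disagree


# Hypothetical example: the third copy of an instruction has one flipped bit.
voted, disagree = vote(0x440104, 0x440104, 0x450104)
```

With a single corrupted input the voted word equals the two matching inputs and the flags identify the odd one out. If two inputs are corrupted in the same bit position, the majority itself is wrong, which is why the one-error limit matters.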

2. Test Bench


The memories are pre-configured so no special settings are needed in this test
bench. The simulation ends at 3400 ns. The clock high and low times for both memories
and processors are each 50 ns. The input setup time and output valid delay are 10 ns for
processors and 5 ns for memories.

3. Memories Pre-configuration


At each address, the instruction differs in only one of the ROMs. This avoids multiple
errors being sent to the voters at the same time. The RAM contains non-repeated data in
each address. Details on how to read the error detection signal and analyze the error are
discussed in Chapter V.
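
The pre-configuration rule above (no more than one differing instruction per address) can be checked mechanically before a simulation is run. The sketch below is an illustration only; the function name and the ROM contents are hypothetical and are not the images used in this thesis.

```python
def at_most_one_differs(rom_a, rom_b, rom_c):
    """True if, at every address, at least two of the three words agree."""
    return all(a == b or a == c or b == c
               for a, b, c in zip(rom_a, rom_b, rom_c))


# Hypothetical ROM images with one injected fault per address.
rom_a = [0x440104, 0x450102, 0x000000]
rom_b = [0x440104, 0x450103, 0x000000]   # differs at address 1
rom_c = [0x440105, 0x450102, 0x000000]   # differs at address 0
ok = at_most_one_differs(rom_a, rom_b, rom_c)
```

If the check fails for some address, all three words differ there and a two-out-of-three vote has no majority to select.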

4. Simulation Result

G. RECONCILER


1. Schematic


[Schematic symbol: rec]


2. Test Bench


The  clock  high  and  low  times  are  each  50  ns.  The  input  setup  time  and  output 
valid  delay  are  each  10  ns.  Manually  set  values  in  the  data  address,  the  program  counter 
and  the  data  were  used  to  distinguish  which  one  was  fetched. 

3. Simulation Result

H. RECONCILER WITH KDLX AND MEMORY
1. Schematic


2. Test Bench


The  clock  high  and  low  times  for  KDLX,  Reconciler,  and  memory  are  50  ns,  25 
ns,  and  25  ns,  respectively.  The  input  setup  times  and  output  valid  delays  for  KDLX, 
Reconciler,  and  memory  are  8  ns,  9  ns,  and  10  ns,  respectively. 

3. Simulation Result

I. INTERRUPT

1. Schematic

The rfe_i(23:0) signal is used to monitor the RFE instruction. The pc_in(15:0) signal is
connected to the program counter of the KDLX. The signal sel_i(23:0) controls the muxes in
order to insert the TRAP and Jump instructions sent out from trap_i(23:0).


[Schematic symbol: interrupt]


2,  Test  Bench 

Random  numbers  are  assigned  to  rfe_i(23:0)  and pc_in(15:0).  An  RFE  instruc¬ 
tion  at  time  900  ns  emulates  the  end  of  the  ISR. 
3. Simulation Result

[Waveform figure: /testbench signals clk_i, rfe_i, pc_in, err, reset_i, pc_out, sel_i, trap_i, state_i]

J.  INTERRUPT  WITH  KDLX  AND  MEMORY 

1. Schematic

The Reconciler is not included in this schematic, so two memories are used for a
Harvard architecture. In this design, the Interrupt only needs to monitor the instructions
from the ROM. The error signal is triggered manually in the test bench. Once the ISR
starts, the instruction on the bus is replaced with the TRAP instruction, which leads the
KDLX to execute the specific ISR. The last instruction in the ISR is the RFE instruction,
which activates the Interrupt to insert a new Jump instruction into KDLX. Then the
circuit goes back to its normal operation.
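
The sequence described above can be sketched as a small state machine. This is a simplified model, assuming one step per KDLX clock cycle (the Interrupt VHDL in Appendix C uses an A and a B state per KDLX cycle); the state names and the `run` helper are hypothetical, and the RFE opcode 0xF8 is taken from the "check F80000" comparison in that listing.

```python
RFE_OPCODE = 0xF8  # upper byte checked by the Interrupt VHDL ("11111000")


def next_state(state, err, opcode):
    """Next state of a simplified Interrupt model (one step per KDLX cycle)."""
    if state == "NORMAL":
        return "TRAP" if err else "NORMAL"
    if state == "TRAP":
        return "NOP0"
    if state == "NOP0":
        return "NOP1"
    if state == "NOP1":
        return "WAIT"
    if state == "WAIT":
        # Stay in the ISR until the RFE instruction is seen on the bus.
        return "BACK" if opcode == RFE_OPCODE else "WAIT"
    if state == "BACK":
        return "NORMAL"
    raise ValueError("unknown state: %s" % state)


def run(inputs, state="NORMAL"):
    """Apply a list of (err, opcode) pairs and return the visited states."""
    trace = [state]
    for err, opcode in inputs:
        state = next_state(state, err, opcode)
        trace.append(state)
    return trace
```

For example, an error followed by an ISR whose second instruction is RFE visits NORMAL, TRAP, NOP0, NOP1, WAIT, WAIT, BACK, and NORMAL in that order.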
2. Test Bench

The KDLX clock high and low times are each 50 ns. The input setup time and
output valid delay are each 10 ns. The Interrupt, ROM, and RAM all run at double speed
with clock high and low times of 25 ns. The setup and hold times are each 3 ns.
An error is generated in the test bench at 900 ns to check the function of the state
machine. This test bench stops at 4900 ns.
3. Memory Pre-configuration and Results

The highlighted Opcode is where an error occurs in the test bench. Contents in
the Instruction Mem and the upper half of the Data Mem are pre-configured. Registers
and the lower half of the Data Mem hold the final values after the simulation is
done.

4. Simulation Result

K. THE FULL DESIGN WITHOUT ESSD

1. Schematic

Three  RAMs  are  used  to  provide  inconsistent  data  to  TMRA.  This  schematic  is 
designed  for  simulating  the  circumstance  at  the  occurrence  of  an  error.  The  real  design 
needs  only  one  RAM  and  does  not  have  to  triplicate  the  instruction  and  data  buses. 

2. Test Bench

The clock high and low times for KDLX, Reconciler, Interrupt, and memory are
50 ns, 25 ns, 25 ns, and 25 ns, respectively. The input setup times and output valid
delays for KDLX, Reconciler, Interrupt, and memory are 8 ns, 9 ns, 9 ns, and 10 ns,
respectively. The ending point of this test bench is at 4900 ns.


The signals between clk_i and clk_m are associated with the Interrupt clock cycle.
The signals between clk_m and clk_p are associated with the memory clock cycle. Each


3. Memory Pre-configurations

4. Simulation Result

5. Zoom-in Figures of cid_1 and cid_0

[Zoom-in waveform figures of cid_1 and cid_0]

L.  THE  FULL  DESIGN  WITH  ESSD 
1. Schematic

The ESSD intercepts all connections on the RAMs while the error syndromes are
being stored. The clocks for the Interrupt and Reconciler are wired together since they
work in parallel.

2. Test Bench

The clock high and low times for KDLX, latch51, Reconciler (or Interrupt),
ESSD, and memory are 50 ns, 25 ns, 25 ns, 25 ns, and 25 ns, respectively. The input
setup times and output valid delays for KDLX, latch51, Reconciler (or Interrupt), ESSD,
and memory are 8 ns, 8 ns, 9 ns, 9 ns, and 10 ns, respectively. The test bench ends at
time 4900 ns.


3. Simulation Result

[Waveform figure: /testbench signals clk_p, clk_l, clk_ir, clk_s, clk_m, reset_p, pc_in, addr_in, addr_s, cid1_in, cid0_in, dout_ma, dout_mb, dout_mc, err, stall_s, wr_s, sel_wr, ess, prog_p, sel_addr, sel_s, state_l, state_r, state_s, trap_i, pc_out]
APPENDIX B: KDLX INSTRUCTION SET DESCRIPTION


This appendix lists all of the operation codes and functions of the instructions
used in the KDLX. This reference was originally contained in Dr. Kenneth Clark's
dissertation [8]. Some errors were found and have been checked with the author. The
function of the corrected operation codes has been verified in the simulations of this
thesis. The operation descriptions are revised in order to give a clear description of
how data transfers.

Some symbols used in this appendix need to be introduced first. Rs1 represents
one of the 15 registers in KDLX. Rs2 represents one of the 15 registers in KDLX as
well. Rs1 and Rs2 could be the same register. Rd represents one of the 15 registers in
KDLX used as a destination register. Immed7 represents the most significant bit of an
8-bit immediate value. [(Immed7)^8 || Immed] represents an 8-bit immediate value being
sign extended to 16 bits.
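
The sign-extension notation can be made concrete with a short sketch. It assumes the immediate occupies instruction bits 7 to 0, as the field layouts in this appendix show; the function names are hypothetical.

```python
def sign_extend_8_to_16(immed):
    """Replicate bit 7 of an 8-bit immediate into bits 15..8."""
    immed &= 0xFF
    if immed & 0x80:             # Immed7 is set: fill the upper byte with ones
        return immed | 0xFF00
    return immed                 # Immed7 is clear: upper byte stays zero


def zero_extend_8_to_16(immed):
    """Fill bits 15..8 with zeros, as used by the unsigned immediate forms."""
    return immed & 0xFF
```

For instance, the immediate 0x80 sign-extends to 0xFF80 but zero-extends to 0x0080.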


Instruction: ADD (Register Add)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x01 | Rs1   | Rd   | Rs2 | Unused

Usage: ADD Rd, Rs1, Rs2
Operation: Rd <- (Rs1 + Rs2)


Instruction: ADDI (Add Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x01 | Rs1   | Rd   | Rs2 | Unused

Usage: ADDI Rd, Rs1, Immed
Operation: Rd <- (Rs1 + [(Immed7)^8 || Immed])


Instruction: ADDUI (Add Unsigned Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x21 | Rs1   | Rd   | Immed

Usage: ADDUI Rd, Rs1, Immed
Operation: Rd <- (Rs1 + [(0)^8 || Immed])
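
The field layouts above translate directly into shifts and masks on a 24-bit instruction word. The following sketch decodes the register format and the immediate format; the function names and the example words are hypothetical and were assembled by hand from the layouts shown.

```python
def decode_register_format(word):
    """Split a 24-bit word into Opcode (23-16), Rs1 (15-12), Rd (11-8), Rs2 (7-4)."""
    return {
        "opcode": (word >> 16) & 0xFF,
        "rs1": (word >> 12) & 0xF,
        "rd": (word >> 8) & 0xF,
        "rs2": (word >> 4) & 0xF,
    }


def decode_immediate_format(word):
    """Split a 24-bit word into Opcode (23-16), Rs1 (15-12), Rd (11-8), Immed (7-0)."""
    return {
        "opcode": (word >> 16) & 0xFF,
        "rs1": (word >> 12) & 0xF,
        "rd": (word >> 8) & 0xF,
        "immed": word & 0xFF,
    }


add_fields = decode_register_format(0x011320)     # ADD R3, R1, R2
addui_fields = decode_immediate_format(0x21137F)  # ADDUI R3, R1, 0x7F
```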
Instruction: AND (Register AND)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x09 | Rs1   | Rd   | Rs2 | Unused

Usage: AND Rd, Rs1, Rs2
Operation: Rd <- (Rs1 (logical-and) Rs2)


Instruction: ANDI (AND Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x29 | Rs1   | Rd   | Immed

Usage: ANDI Rd, Rs1, Immed
Operation: Rd <- (Rs1 (logical-and) [(Immed7)^8 || Immed])


Instruction: BEQZ (Branch if Equal to Zero)

  Bits  | 23-16        | 15-12 | 11-8   | 7-0
  Field | Opcode: 0xC1 | Rs1   | Unused | Immed

Usage: BEQZ Rs1, Immed
Operation: If Rs1 = 0, then Program Address <- (PC + 1 + [(Immed7)^8 || Immed])


Instruction: BNEZ (Branch if Not Equal to Zero)

  Bits  | 23-16        | 15-12 | 11-8   | 7-0
  Field | Opcode: 0xC0 | Rs1   | Unused | Immed

Usage: BNEZ Rs1, Immed
Operation: If Rs1 ≠ 0, then Program Address <- (PC + 1 + [(Immed7)^8 || Immed])


Instruction: J (Jump)

  Bits  | 23-16        | 15-0
  Field | Opcode: 0xC8 | Immed

Usage: J Immed
Operation: Program Address <- Immed
Instruction: JAL (Jump and Link)

  Bits  | 23-16        | 15-0
  Field | Opcode: 0xE8 | Immed

Usage: JAL Immed
Operation: Program Address <- Immed;
           R15 <- Link Program Address


Instruction: JALR (Jump Register and Link)

  Bits  | 23-16        | 15-12 | 11-0
  Field | Opcode: 0x68 | Rs1   | Unused

Usage: JALR Rs1
Operation: Program Address <- (Rs1);
           R15 <- Link Program Address


Instruction: JR (Jump Register)

  Bits  | 23-16        | 15-12 | 11-0
  Field | Opcode: 0x48 | Rs1   | Unused

Usage: JR Rs1
Operation: Program Address <- (Rs1)


Instruction: LHI (Load High Immediate)

  Bits  | 23-16        | 15-12  | 11-8 | 7-0
  Field | Opcode: 0x08 | Unused | Rd   | Immed

Usage: LHI Rd, Immed
Operation: Rd <- Immed || (0)^8


Instruction: LW (Load Word)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x44 | Rs1   | Rd   | Immed

Usage: LW Rd, Rs1(Immed)
Operation: Rd <- Mem{Rs1 + [(Immed7)^8 || Immed]}
Instruction: NOP (No Operation)

  Bits  | 23-16        | 15-0
  Field | Opcode: 0x00 | Unused

Usage: NOP
Operation: None


Instruction: OR (Register OR)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x2A | Rs1   | Rd   | Rs2 | Unused

Usage: OR Rd, Rs1, Rs2
Operation: Rd <- (Rs1 (logical-or) Rs2)


Instruction: ORI (OR Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x2A | Rs1   | Rd   | Immed

Usage: ORI Rd, Rs1, Immed
Operation: Rd <- (Rs1 (logical-or) Immed)


Instruction: RFE (Return from Exception)

  Bits  | 23-16        | 15-0
  Field | Opcode: 0xF8 | Unused

Usage: RFE
Operation: Program Address <- Interrupt Address Register


Instruction: SEQ (Set if Equal)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x18 | Rs1   | Rd   | Rs2 | Unused

Usage: SEQ Rd, Rs1, Rs2
Operation: If Rs1 = Rs2, then Rd = 0x0001 else Rd = 0x0000
Instruction: SEQI (Set Equal Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x58 | Rs1   | Rd   | Immed

Usage: SEQI Rd, Rs1, Immed
Operation: If Rs1 = [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SGE (Set if Greater Than or Equal)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x19 | Rs1   | Rd   | Rs2 | Unused

Usage: SGE Rd, Rs1, Rs2
Operation: If Rs1 ≥ Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SGEI (Set if Greater Than or Equal Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x59 | Rs1   | Rd   | Immed

Usage: SGEI Rd, Rs1, Immed
Operation: If Rs1 ≥ [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SGT (Set if Greater Than)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x1A | Rs1   | Rd   | Rs2 | Unused

Usage: SGT Rd, Rs1, Rs2
Operation: If Rs1 > Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SGTI (Set if Greater Than Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x5A | Rs1   | Rd   | Immed

Usage: SGTI Rd, Rs1, Immed
Operation: If Rs1 > [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000
Instruction: SLE (Set if Less Than or Equal)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x1B | Rs1   | Rd   | Rs2 | Unused

Usage: SLE Rd, Rs1, Rs2
Operation: If Rs1 ≤ Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SLEI (Set if Less Than or Equal Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x5B | Rs1   | Rd   | Immed

Usage: SLEI Rd, Rs1, Immed
Operation: If Rs1 ≤ [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SLL (Shift Logic Left)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x11 | Rs1   | Rd   | Rs2 | Unused

Usage: SLL Rd, Rs1, Rs2
Operation: Rd <- (Rs1) shifted left by Rs2(3:0) bits


Instruction: SLLI (Shift Logic Left Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x51 | Rs1   | Rd   | Immed

Usage: SLLI Rd, Rs1, Immed
Operation: Rd <- (Rs1) shifted left by Immed(3:0) bits


Instruction: SLT (Set if Less Than)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x1C | Rs1   | Rd   | Rs2 | Unused

Usage: SLT Rd, Rs1, Rs2
Operation: If Rs1 < Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SLTI (Set if Less Than Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x5C | Rs1   | Rd   | Immed

Usage: SLTI Rd, Rs1, Immed
Operation: If Rs1 < [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SNE (Set if Not Equal)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x1D | Rs1   | Rd   | Rs2 | Unused

Usage: SNE Rd, Rs1, Rs2
Operation: If Rs1 ≠ Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SNEI (Set if Not Equal Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x58 | Rs1   | Rd   | Immed

Usage: SNEI Rd, Rs1, Immed
Operation: If Rs1 ≠ [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SRA (Shift Right Arithmetic)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x13 | Rs1   | Rd   | Rs2 | Unused

Usage: SRA Rd, Rs1, Rs2
Operation: Rd <- (Rs1) shifted right by Rs2(3:0) bits, with Rs1(15) shifted in from
           the left (for sign extension)


Instruction: SRAI (Shift Right Arithmetic Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x53 | Rs1   | Rd   | Immed

Usage: SRAI Rd, Rs1, Immed
Operation: Rd <- (Rs1) shifted right by Immed(3:0) bits, with Rs1(15) shifted in from
           the left (for sign extension)
Instruction: SRL (Shift Right Logical)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x12 | Rs1   | Rd   | Rs2 | Unused

Usage: SRL Rd, Rs1, Rs2
Operation: Rd <- (Rs1) shifted right by Rs2(3:0) bits, with 0's shifted in from the left


Instruction: SRLI (Shift Right Logical Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x52 | Rs1   | Rd   | Immed

Usage: SRLI Rd, Rs1, Immed
Operation: Rd <- (Rs1) shifted right by Immed(3:0) bits, with 0's shifted in from the left
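
The difference between the arithmetic and logical right shifts shows up only when bit 15 of the source register is set. A minimal sketch on 16-bit values follows; the function names are hypothetical, and the shift amount is limited to the low four bits as in the operation descriptions.

```python
def srl16(value, amount):
    """Logical right shift of a 16-bit value: zeros fill the vacated bits."""
    return (value & 0xFFFF) >> (amount & 0xF)


def sra16(value, amount):
    """Arithmetic right shift of a 16-bit value: bit 15 fills the vacated bits."""
    value &= 0xFFFF
    amount &= 0xF
    if value & 0x8000:
        value -= 0x10000          # reinterpret as a negative Python integer
    return (value >> amount) & 0xFFFF
```

Shifting 0x8000 right by four bits gives 0x0800 with the logical form and 0xF800 with the arithmetic form.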


Instruction: SUB (Register Subtract)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x03 | Rs1   | Rd   | Rs2 | Unused

Usage: SUB Rd, Rs1, Rs2
Operation: Rd <- (Rs1 - Rs2)


Instruction: SUBI (Subtract Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x43 | Rs1   | Rd   | Immed

Usage: SUBI Rd, Rs1, Immed
Operation: Rd <- (Rs1 - [(Immed7)^8 || Immed])


Instruction: SUBUI (Subtract Unsigned Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x23 | Rs1   | Rd   | Immed

Usage: SUBUI Rd, Rs1, Immed
Operation: Rd <- (Rs1 - [(0)^8 || Immed])
Instruction: SW (Store Word)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x45 | Rs1   | Rd   | Immed

Usage: SW Rs2, Rs1(Immed)
Operation: Mem{Rs1 + [(Immed7)^8 || Immed]} <- Rs2


Instruction: TRAP (Software Trap)

  Bits  | 23-16        | 15-0
  Field | Opcode: 0x28 | Unused

Usage: TRAP Immed
Operation: Program Address <- Immed;
           Interrupt_Address_Register <- Link_Program_Address


Instruction: XOR (Register Exclusive-OR)

  Bits  | 23-16        | 15-12 | 11-8 | 7-4 | 3-0
  Field | Opcode: 0x0B | Rs1   | Rd   | Rs2 | Unused

Usage: XOR Rd, Rs1, Rs2
Operation: Rd <- (Rs1 (exclusive-or) Rs2)


Instruction: XORI (Exclusive-OR Immediate)

  Bits  | 23-16        | 15-12 | 11-8 | 7-0
  Field | Opcode: 0x2B | Rs1   | Rd   | Immed

Usage: XORI Rd, Rs1, Immed
Operation: Rd <- (Rs1 (exclusive-or) Immed)
APPENDIX C: VHDL CODE

A. RECONCILER

-- *********************************************************************
-- Module: Reconciler
-- Function: The Reconciler is used as an interface between the KDLX
-- and memory. It runs two times faster than the KDLX.
-- Author: Rong Yuan, TWAF
-- Date: Nov 14, 2003
-- *********************************************************************

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity rec is Port (
    clk_r:      in    std_logic;
    reset_r:    in    std_logic;
    rd_r:       in    std_logic;
    wr_r:       in    std_logic;
    addrin_r:   in    std_logic_vector(15 downto 0);
    pc_r:       in    std_logic_vector(15 downto 0);
    datain_r:   in    std_logic_vector(23 downto 0);
    addrout_r:  out   std_logic_vector(15 downto 0);
    instr_data: out   std_logic_vector(23 downto 0);
    dataout_r:  out   std_logic_vector(23 downto 0);
    mem_data:   inout std_logic_vector(15 downto 0);
    wrout_r:    out   std_logic;
    state_r:    out   std_logic_vector(3 downto 0)
);
end rec;

architecture fsm of rec is  -- fsm is Finite State Machine
    type targetFSM is (State, State0, State1, ReadState, WriteState);
    signal currState, nextState: targetFSM;
begin

    -- Process to select the next state
    nxtStProc: process (currState, rd_r, wr_r)
    begin
        case currState is
            when State =>
                nextState <= State0;
            when State0 =>
                if (rd_r='0' and wr_r='1') then       -- read from memory
                    nextState <= ReadState;
                elsif (rd_r='1' and wr_r='0') then    -- write to memory
                    nextState <= WriteState;
                else
                    nextState <= State1;
                end if;
            when State1 =>
                nextState <= State0;
            when ReadState =>
                nextState <= State0;
            when WriteState =>
                nextState <= State0;
        end case;
    end process nxtStProc;

    -- Process to register the current state
    curStProc: process (clk_r, reset_r)
    begin
        if (reset_r='0') then
            currState <= State;
        elsif (clk_r'event and clk_r='1') then
            currState <= nextState;
        end if;
    end process curStProc;

    -- Process to generate outputs
    outConProc: process (currState, wr_r, pc_r, datain_r, addrin_r,
                         mem_data)
    begin
        case currState is
            when State =>                  -- generated for reset only
                null;                      -- without this state, state machine
                                           -- starts at State1 after reset
            when State0 =>                 -- doing instruction fetch
                state_r    <= "0000";
                wrout_r    <= wr_r;
                addrout_r  <= pc_r;        -- sending pc to memory
                instr_data <= datain_r;    -- memory sends instruction to KDLX
                dataout_r  <= (others => 'Z');
                mem_data   <= (others => 'Z');
            when State1 =>                 -- exactly the same as State0
                                           -- for keeping current state
                state_r    <= "0001";
                wrout_r    <= wr_r;
                addrout_r  <= pc_r;
                instr_data <= datain_r;
                dataout_r  <= (others => 'Z');
                mem_data   <= (others => 'Z');
            when ReadState =>              -- When KDLX reads data from memory
                state_r    <= "0010";
                wrout_r    <= wr_r;        -- write signal is one
                addrout_r  <= addrin_r;    -- sending address to memory
                mem_data   <= datain_r(15 downto 0);
                                           -- memory sends data to KDLX
                dataout_r  <= (others => 'Z');  -- block input to memory
            when WriteState =>             -- When KDLX writes data to memory
                state_r    <= "0011";
                wrout_r    <= wr_r;        -- write signal is zero
                addrout_r  <= addrin_r;    -- sending address to memory
                dataout_r(15 downto 0)  <= mem_data;
                                           -- KDLX sends data to memory
                dataout_r(23 downto 16) <= "00000000";
                                           -- upper byte is filled with zeros
        end case;
    end process outConProc;
end fsm;
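
The next-state logic of the Reconciler can be checked quickly with a behavioral sketch outside the simulator. This is a hypothetical Python model of the nxtStProc process only; state names are written as strings, and the control inputs are the integers 0 and 1 rather than std_logic values.

```python
def rec_next_state(state, rd_r, wr_r):
    """Next state of the Reconciler FSM, following the nxtStProc case statement."""
    if state == "State0":
        if rd_r == 0 and wr_r == 1:      # read from memory
            return "ReadState"
        if rd_r == 1 and wr_r == 0:      # write to memory
            return "WriteState"
        return "State1"
    if state in ("State", "State1", "ReadState", "WriteState"):
        return "State0"                  # every other state returns to the fetch state
    raise ValueError("unknown state: %s" % state)
```

Because every non-fetch state returns to State0, the model alternates between an instruction fetch and one data or idle cycle, in line with the Reconciler running twice as fast as the KDLX.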
B. INTERRUPT

-- *********************************************************************
-- Module: Interrupt
-- Function: The Interrupt is used to switch to ISR when err occurs.
-- It runs in double speed and has the same time constraints with
-- Reconciler. TRAP to other instruction set and jump back when done.
-- Notation: This Interrupt is revised to work with TMRA in this design
-- only. This is the final version before ESSD is generated. Only two
-- NOPs after TRAP.
-- Author: Rong Yuan, TWAF
-- Date: Nov 17, 2003
-- *********************************************************************

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity Interrupt is Port (
    rfe_i:   in  std_logic_vector(23 downto 0);
    pc_in:   in  std_logic_vector(15 downto 0);
    err:     in  std_logic;
    reset_i: in  std_logic;
    clk_i:   in  std_logic;
    pc_out:  out std_logic_vector(15 downto 0);
    sel_i:   out std_logic_vector(23 downto 0);
    trap_i:  out std_logic_vector(23 downto 0);
    state_i: out std_logic_vector(3 downto 0)
);
end Interrupt;

architecture fsm of Interrupt is
    type targetFSM is (State, State0_A, State0_B, TrapState_A, TrapState_B,
                       NopState0_A, NopState0_B, NopState1_A, NopState1_B,
                       WaitState_A, WaitState_B, BackState_A, BackState_B);
    signal pc_latch: std_logic_vector(15 downto 0);
    signal new_instr: std_logic_vector(23 downto 0);
    signal currState, nextState: targetFSM;
begin

    -- Process to select the next state
    nxtStProc: process (currState, err, rfe_i)
    begin
        case currState is
            when State =>
                nextState <= State0_A;
            when State0_A =>
                nextState <= State0_B;
            when State0_B =>
                if (err='1') then
                    nextState <= TrapState_A;
                else
                    nextState <= State0_A;
                end if;
            when TrapState_A =>
                nextState <= TrapState_B;
            when TrapState_B =>
                nextState <= NopState0_A;
            when NopState0_A =>
                nextState <= NopState0_B;
            when NopState0_B =>
                nextState <= NopState1_A;
            when NopState1_A =>
                nextState <= NopState1_B;
            when NopState1_B =>
                nextState <= WaitState_A;
            when WaitState_A =>
                nextState <= WaitState_B;
            when WaitState_B =>
                if (rfe_i(23 downto 16)="11111000") then  -- check F80000
                    nextState <= BackState_A;
                else
                    nextState <= WaitState_A;   -- stay if not seeing F80000
                end if;
            when BackState_A =>
                nextState <= BackState_B;
            when BackState_B =>
                nextState <= State0_A;
        end case;
    end process nxtStProc;

    -- Process to register the current state
    curStProc: process (clk_i, reset_i)
    begin
        if (reset_i='0') then
            currState <= State;
        elsif (clk_i'event and clk_i='1') then
            currState <= nextState;
        end if;
    end process curStProc;

    -- Process to generate outputs
    outConProc: process (currState, pc_in)
    begin
        case currState is
            when State =>
                null;
            when State0_A =>
                state_i <= "0000";
                trap_i  <= (others => 'Z');
                sel_i   <= "111111111111111111111111";
                pc_out  <= (others => 'Z');
            when State0_B =>
                state_i <= "0001";
                trap_i  <= (others => 'Z');
                sel_i   <= "111111111111111111111111";
                pc_out  <= (others => 'Z');
            when TrapState_A =>
                state_i  <= "0010";
                sel_i    <= "000000000000000000000000";  -- allow TRAP pass to KDLX
                trap_i   <= "001010000000000000110000";  -- TRAP instr 280030
                pc_latch <= pc_in;           -- latch pc for new instruction
            when TrapState_B =>
                state_i <= "0011";
                sel_i   <= "000000000000000000000000";
                pc_out  <= pc_latch;         -- show latched pc on bus
            when NopState0_A =>
                state_i <= "0100";
                trap_i  <= "000000000000000000000000";   -- allow NOP to KDLX
                sel_i   <= "000000000000000000000000";
                pc_out  <= (others => 'Z');
            when NopState0_B =>
                state_i <= "0101";
                sel_i   <= "000000000000000000000000";   -- allow NOP to KDLX
                pc_out  <= (others => 'Z');
            when NopState1_A =>
                state_i <= "0110";
                trap_i  <= "000000000000000000000000";
                sel_i   <= "000000000000000000000000";
                pc_out  <= (others => 'Z');
                -- construct new JUMP instr
                new_instr(23 downto 16) <= "11001000";
                new_instr(15 downto 0)  <= pc_latch;     -- JUMP is C8+pc
            when NopState1_B =>
                state_i <= "0111";
                sel_i   <= "000000000000000000000000";
                pc_out  <= (others => 'Z');
            when WaitState_A =>
                state_i <= "1000";
                trap_i  <= (others => 'Z');
                sel_i   <= "111111111111111111111111";
                pc_out  <= (others => 'Z');
            when WaitState_B =>
                state_i <= "1001";
                trap_i  <= (others => 'Z');
                sel_i   <= "111111111111111111111111";
                pc_out  <= (others => 'Z');
            when BackState_A =>
                state_i <= "1010";
                trap_i  <= new_instr;        -- allow new JUMP to KDLX
                sel_i   <= "000000000000000000000000";
                pc_out  <= (others => 'Z');
            when BackState_B =>
                state_i <= "1011";
                sel_i   <= "000000000000000000000000";
                pc_out  <= (others => 'Z');
        end case;
    end process outConProc;
end fsm;
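
The two instruction words that the Interrupt drives onto trap_i can be reproduced arithmetically. The sketch below is a hypothetical Python check of the bit patterns in the listing above: the TRAP word is the 24-bit constant assigned in TrapState_A, and the Jump word is the opcode C8 concatenated with the latched program counter.

```python
TRAP_WORD = int("001010000000000000110000", 2)  # constant assigned to trap_i
JUMP_OPCODE = 0xC8                              # "11001000" in the listing


def make_jump(pc_latch):
    """Build the 24-bit Jump word: opcode in bits 23-16, latched pc in bits 15-0."""
    return (JUMP_OPCODE << 16) | (pc_latch & 0xFFFF)


jump_back = make_jump(0x0012)  # return address 0x0012 chosen only as an example
```

The TRAP word evaluates to 0x280030, that is, opcode 0x28 with 0x0030 in the low 16 bits.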


C. RECONCILER FOR THE FULL DESIGN

-- *********************************************************************
-- Module: Reconciler
-- Function: The Reconciler is used as an interface between TMRA and
-- memory. It runs in double speed. Act as instruction memory in the
-- first half KDLX clock and as data memory in the second half KDLX
-- clock.
-- Notation: This Reconciler is revised to work with the TMRA in this
-- design only. Data buses are triplicated.
-- Author: Rong Yuan, TWAF
-- Date: Nov 14, 2003
-- *********************************************************************

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity rec2 is Port (
    clk_r:        in  std_logic;
    reset_r:      in  std_logic;
    rd_r:         in  std_logic;
    wr_r:         in  std_logic;
    addrin_r:     in  std_logic_vector(15 downto 0);
    pc_r:         in  std_logic_vector(15 downto 0);
    datain_a:     in  std_logic_vector(23 downto 0);
    datain_b:     in  std_logic_vector(23 downto 0);
    datain_c:     in  std_logic_vector(23 downto 0);
    addrout_r:    out std_logic_vector(15 downto 0);
    instr_data_a: out std_logic_vector(23 downto 0);
    instr_data_b: out std_logic_vector(23 downto 0);
    instr_data_c: out std_logic_vector(23 downto 0);
    dataout_r:    out std_logic_vector(23 downto 0);
    mem_data_a:   out std_logic_vector(15 downto 0);
                                         -- data from mem to KDLX
    mem_data_b:   out std_logic_vector(15 downto 0);
    mem_data_c:   out std_logic_vector(15 downto 0);
    mem_data_wr:  in  std_logic_vector(15 downto 0);
                                         -- data from KDLX to mem
    wrout_r:      out std_logic;
    state_r:      out std_logic_vector(3 downto 0)
);
end rec2;

architecture fsm of rec2 is  -- fsm is Finite State Machine
    type targetFSM is (State, State0, State1, ReadState, WriteState);
    signal currState, nextState: targetFSM;
begin

    -- Process to select the next state
    nxtStProc: process (currState, rd_r, wr_r)
    begin
        case currState is
            when State =>
                nextState <= State0;
            when State0 =>
                if (rd_r='0' and wr_r='1') then       -- read from memory
                    nextState <= ReadState;
                elsif (rd_r='1' and wr_r='0') then    -- write to memory
                    nextState <= WriteState;
                else
                    nextState <= State1;
                end if;
            when State1 =>
                nextState <= State0;
            when ReadState =>
                nextState <= State0;
            when WriteState =>
                nextState <= State0;
        end case;
    end process nxtStProc;

    -- Process to register the current state
    curStProc: process (clk_r, reset_r)
    begin
        if (reset_r='0') then
            currState <= State;
        elsif (clk_r'event and clk_r='1') then
            currState <= nextState;
        end if;
    end process curStProc;

    -- Process to generate outputs
    outConProc: process (currState, wr_r, pc_r, datain_a, datain_b,
                         datain_c, addrin_r, mem_data_wr)
    begin
        case currState is
            when State =>                  -- generated for reset only
                null;                      -- without this state, state machine
                                           -- starts at State1 after reset
            when State0 =>                 -- doing instruction fetch
                state_r   <= "0000";
                wrout_r   <= wr_r;
                addrout_r <= pc_r;         -- sending pc to memory
                if (datain_a(23 downto 16)="11111000") then
                    instr_data_a <= "000000000000000000000000";
                    instr_data_b <= "000000000000000000000000";
                    instr_data_c <= "000000000000000000000000";
                else                       -- memory sends instruction to KDLX
                    instr_data_a <= datain_a;
                    instr_data_b <= datain_b;
                    instr_data_c <= datain_c;
                end if;
                dataout_r  <= (others => 'Z');
                mem_data_a <= (others => 'Z');
                mem_data_b <= (others => 'Z');
                mem_data_c <= (others => 'Z');
            when State1 =>                 -- exactly the same as State0
                                           -- for keeping current state
                state_r   <= "0001";
                wrout_r   <= wr_r;
                addrout_r <= pc_r;
                if (datain_a(23 downto 16)="11111000") then
                    instr_data_a <= "000000000000000000000000";
                    instr_data_b <= "000000000000000000000000";
                    instr_data_c <= "000000000000000000000000";
                else                       -- memory sends instruction to KDLX
                    instr_data_a <= datain_a;
                    instr_data_b <= datain_b;
                    instr_data_c <= datain_c;
                end if;
                dataout_r  <= (others => 'Z');
                mem_data_a <= (others => 'Z');
                mem_data_b <= (others => 'Z');
                mem_data_c <= (others => 'Z');
            when ReadState =>              -- When KDLX reads data from memory
                state_r   <= "0010";
                wrout_r   <= wr_r;         -- write signal is one
                addrout_r <= addrin_r;     -- sending address to memory
                -- memory sends data to KDLX
                mem_data_a <= datain_a(15 downto 0);
                mem_data_b <= datain_b(15 downto 0);
                mem_data_c <= datain_c(15 downto 0);
                dataout_r  <= (others => 'Z');  -- block input to memory
            when WriteState =>             -- When KDLX writes data to memory
                state_r   <= "0011";
                wrout_r   <= wr_r;         -- write signal is zero
                addrout_r <= addrin_r;     -- sending address to memory
                -- KDLX sends data to memory
                dataout_r(15 downto 0)  <= mem_data_wr;
                dataout_r(23 downto 16) <= "00000000";  -- upper byte is filled with zeros
        end case;
    end process outConProc;
end fsm;
D. ESSD


-- **********************************************************************
-- Module: Error Syndrome Storage Device (ESSD)
--
-- Function: The ESSD stores the error syndrome when an error occurs
-- (err asserted). It runs at double speed and has the same timing
-- constraints as the Reconciler. It stalls the KDLX at the beginning
-- of the ISR.
--
-- Note: This ESSD works with the TMRA in this design only. This
-- is the final version.
--
-- Author: Rong Yuan, TWAF
--
-- Date: Nov 21, 2003
-- **********************************************************************


library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity essd is Port (
  addr_in:  in std_logic_vector(15 downto 0);
  pc_in:    in std_logic_vector(15 downto 0);
  cid1_in:  in std_logic_vector(50 downto 0);
  cid0_in:  in std_logic_vector(50 downto 0);
  err:      in std_logic;
  reset_s:  in std_logic;
  clk_s:    in std_logic;

  stall_s:  out std_logic;
  wr_s:     out std_logic;
  sel_wr:   out std_logic;

  addr_s:   out std_logic_vector(15 downto 0);
  sel_addr: out std_logic_vector(15 downto 0);
  sel_s:    out std_logic_vector(23 downto 0);
  ess:      out std_logic_vector(23 downto 0);
  state_s:  out std_logic_vector(4 downto 0)
);
end essd;

architecture fsm of essd is

type targetFSM is (State, State0_A, State0_B, LatchState_A,
                   LatchState_B, NopState0_A, NopState0_B, NopState1_A,
                   NopState1_B, StallState, StoreState0_A,
                   StoreState0_B, StoreState0_C, StoreState1_A,
                   StoreState1_B, StoreState1_C, StoreState_addr,
                   StoreState_pc, BackState);

signal pc_latch, addr_latch: std_logic_vector(15 downto 0);
signal cid0_latchA, cid0_latchB, cid0_latchC, cid1_latchA, cid1_latchB,
       cid1_latchC: std_logic_vector(23 downto 0);
signal counter: std_logic_vector(15 downto 0);
signal currState, nextState: targetFSM;

begin 

nxtStProc: process (currState, err)
begin
  case currState is
    when State =>
      nextState <= State0_A;
    when State0_A =>
      nextState <= State0_B;
    when State0_B =>
      if (err = '1') then
        nextState <= LatchState_A;
      else
        nextState <= State0_A;
      end if;
    when LatchState_A =>
      nextState <= LatchState_B;
    when LatchState_B =>
      nextState <= NopState0_A;
    when NopState0_A =>
      nextState <= NopState0_B;
    when NopState0_B =>
      nextState <= NopState1_A;
    when NopState1_A =>
      nextState <= NopState1_B;
    when NopState1_B =>
      nextState <= StallState;
    when StallState =>
      nextState <= StoreState0_A;
    when StoreState0_A =>
      nextState <= StoreState0_B;
    when StoreState0_B =>
      nextState <= StoreState0_C;
    when StoreState0_C =>
      nextState <= StoreState1_A;
    when StoreState1_A =>
      nextState <= StoreState1_B;
    when StoreState1_B =>
      nextState <= StoreState1_C;
    when StoreState1_C =>
      nextState <= StoreState_addr;
    when StoreState_addr =>
      nextState <= StoreState_pc;
    when StoreState_pc =>
      nextState <= BackState;
    when BackState =>
      nextState <= State0_A;
  end case;

end  process  nxtStProc; 

--  Process  to  register  the  current  state 
curStProc: process (clk_s, reset_s)
begin
  if (reset_s = '0') then
    currState <= State;
  elsif (clk_s'event and clk_s = '1') then
    currState <= nextState;
  end if;
end process curStProc;

--  Process  to  generate  outputs 

outConProc: process (currState, pc_in, addr_in, cid1_in, cid0_in)
begin

  counter <= "0000000001011001"; --starting at address 0059

  case currState is
    when State =>
      null;

    when State0_A =>
      state_s <= "00000";
      ess <= (others => 'Z');
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when State0_B =>
      state_s <= "00001";
      ess <= (others => 'Z');
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when LatchState_A => --latch all data here
      state_s <= "00010";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';
      pc_latch <= pc_in;
      addr_latch <= addr_in;
      --separate input data
      cid1_latchC <= cid1_in(23 downto 0);
      cid1_latchB <= cid1_in(47 downto 24);
      cid1_latchA(2 downto 0) <= cid1_in(50 downto 48);
      cid1_latchA(23 downto 3) <= "000000000000000000000";
      cid0_latchC <= cid0_in(23 downto 0);
      cid0_latchB <= cid0_in(47 downto 24);
      cid0_latchA(2 downto 0) <= cid0_in(50 downto 48);
      cid0_latchA(23 downto 3) <= "000000000000000000000";

    when LatchState_B =>
      state_s <= "00011";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState0_A =>
      state_s <= "00100";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState0_B =>
      state_s <= "00101";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState1_A =>
      state_s <= "00110";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState1_B =>
      state_s <= "00111";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';


    when StallState => --stall KDLX
      state_s <= "01000";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '0';

    when StoreState0_A => --store cid0
      state_s <= "01001";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid0_latchC;
      counter <= counter-1;

    when StoreState0_B =>
      state_s <= "01010";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid0_latchB;
      counter <= counter-1;

    when StoreState0_C =>
      state_s <= "01011";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid0_latchA;
      counter <= counter-1;

    when StoreState1_A => --store cid1
      state_s <= "01100";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid1_latchC;
      counter <= counter-1;

    when StoreState1_B =>
      state_s <= "01101";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid1_latchB;
      counter <= counter-1;

    when StoreState1_C =>
      state_s <= "01110";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid1_latchA;
      counter <= counter-1;

    when StoreState_addr => --store mem addr
      state_s <= "01111";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess(15 downto 0) <= addr_latch;
      ess(23 downto 16) <= "00000000";
      counter <= counter-1;

    when StoreState_pc => --store pc
      state_s <= "10000";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess(15 downto 0) <= pc_latch;
      ess(23 downto 16) <= "00000000";
      counter <= counter-1;

    when BackState => --release KDLX
      state_s <= "10001";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';
      addr_s <= (others => 'Z');
      wr_s <= '1';
      ess <= (others => 'Z');


end  case; 

end  process  outConProc; 
end  fsm; 
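
The ESSD listing above latches each 51-bit syndrome input (cid0_in, cid1_in) into three 24-bit words before writing them to memory at descending addresses starting from 0059. The following is a minimal simulation-only sketch of that splitting step, assuming a VHDL-93 simulator. The entity name essd_split_sketch and the function syndrome_word are hypothetical names introduced here for illustration; they are not part of the thesis design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative sketch only: entity and function names are hypothetical.
entity essd_split_sketch is
end essd_split_sketch;

architecture sim of essd_split_sketch is

  -- Returns the 24-bit memory word holding part idx of a 51-bit syndrome:
  --   idx = 0     -> bits 23..0  (corresponds to latchC in the ESSD)
  --   idx = 1     -> bits 47..24 (corresponds to latchB)
  --   otherwise   -> bits 50..48, zero-padded to 24 bits (latchA)
  function syndrome_word (cid : std_logic_vector(50 downto 0);
                          idx : natural) return std_logic_vector is
    variable w : std_logic_vector(23 downto 0) := (others => '0');
  begin
    case idx is
      when 0      => w := cid(23 downto 0);
      when 1      => w := cid(47 downto 24);
      when others => w(2 downto 0) := cid(50 downto 48);
    end case;
    return w;
  end function syndrome_word;

begin

  -- Self-checking stimulus: set one bit in each part and verify the split.
  check : process
    variable cid : std_logic_vector(50 downto 0) := (others => '0');
  begin
    cid(50 downto 48) := "101";
    cid(24)           := '1';
    cid(0)            := '1';
    assert syndrome_word(cid, 0) = x"000001"
      report "word C mismatch" severity failure;
    assert syndrome_word(cid, 1) = x"000001"
      report "word B mismatch" severity failure;
    assert syndrome_word(cid, 2) = x"000005"
      report "word A mismatch" severity failure;
    wait;
  end process check;

end sim;
```

In the listing, the words are stored in the order C, B, A for cid0, then C, B, A for cid1, followed by the latched memory address and the latched PC, so one error event occupies eight consecutive memory words.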
E. KDLX


The KDLX is a 16-bit RISC soft-core processor with a five-stage pipeline:
fetch, decode, execute, memory, and write back. The KDLX was coded by Dr. Kenneth
Clark; the hierarchy of its source files, as displayed in the ISE software, follows.


dlx_testbench (dlx_out.vhd)
  dlx (dlx.vhd)
    core (core.vhd)
      alu (alu.vhd)
        adder (adder.vhd)
          ao22 (AO22.vhd)
        alu_logic (alu_logic.vhd)
        log_barrel (log_barrel.vhd)
          word_mux4 (word_mux4.vhd)
        word_mux4 (word_mux4.vhd)
        word_set (word_set.vhd)
          zero_test (zero_test.vhd)
      pc_control (pc_control.vhd)
        increment (increment.vhd)
        word_mux3 (word_mux3.vhd)
        word_reg_single (word_reg_single.vhd)
          scan_reg (scan_reg.vhd)
      pipeline (pipeline.vhd)
        twelve_bit_reg_single (twelve_bit_reg_single.vhd)
          scan_reg (scan_reg.vhd)
        twenty_four_bit_reg_single (twenty_four_bit_reg_single.vhd)
          twelve_bit_reg_single (twelve_bit_reg_single.vhd)
            scan_reg (scan_reg.vhd)
      regfile (regfile.vhd)
        dest_decoder (Dest_Decoder.vhd)
        word_mux16 (word_mux16.vhd)
        word_reg_single (word_reg_single.vhd)
          scan_reg (scan_reg.vhd)
      rw_control (rw_control.vhd)
      word_mux3 (word_mux3.vhd)
      word_mux4 (word_mux4.vhd)
      word_reg_single (word_reg_single.vhd)
        scan_reg (scan_reg.vhd)
      zero_test (zero_test.vhd)
    io_pads (IO_Pads.vhd)
1. adder.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;
USE IEEE.std_logic_unsigned.all;

-- ***** adder model *****

-- external ports
ENTITY adder IS PORT (
  A        : IN std_logic_vector(15 downto 0);
  B        : IN std_logic_vector(15 downto 0);
  alu_op1  : IN std_logic;
  alu_op3  : IN std_logic;
  alu_op4  : IN std_logic;
  Out_word : OUT std_logic_vector(15 downto 0)
);
END adder;

-- internal structure
ARCHITECTURE rtl OF adder IS

-- COMPONENTS
COMPONENT AO22
PORT (
  A     : IN std_logic;
  B     : IN std_logic;
  C     : IN std_logic;
  D     : IN std_logic;
  \Out\ : OUT std_logic
);
END COMPONENT;

SIGNAL Vdd      : std_logic;
SIGNAL subtract : std_logic;

-- INSTANCES
BEGIN

Vdd <= '1';

AO22_1 : AO22 PORT MAP(
  A     => Vdd,
  B     => alu_op1,
  C     => alu_op4,
  D     => alu_op3,
  \Out\ => subtract
);

process (A, B, subtract)
begin
  if (subtract = '1') then
    out_word <= A-B;
  else out_word <= A+B;
  end if;
end process;

END rtl;
2. alu.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** model *****

-- external ports
ENTITY alu IS PORT (
  A       : IN std_logic_vector(15 downto 0);
  alu_op  : IN std_logic_vector(4 downto 0);
  alu_out : OUT std_logic_vector(15 downto 0);
  B       : IN std_logic_vector(15 downto 0)
);
END alu;

-- internal structure
ARCHITECTURE structural OF alu IS

-- COMPONENTS
COMPONENT adder
PORT (
  A        : IN std_logic_vector(15 downto 0);
  B        : IN std_logic_vector(15 downto 0);
  alu_op1  : IN std_logic;
  alu_op3  : IN std_logic;
  alu_op4  : IN std_logic;
  Out_word : OUT std_logic_vector(15 downto 0)
);
END COMPONENT;

COMPONENT alu_logic
PORT (
  A         : IN std_logic_vector(15 downto 0);
  B         : IN std_logic_vector(15 downto 0);
  Func      : IN std_logic_vector(1 downto 0);
  logic_out : OUT std_logic_vector(15 downto 0)
);
END COMPONENT;

COMPONENT log_barrel
PORT (
  ar_or_log : IN std_logic;
  In_Word   : IN std_logic_vector(15 downto 0);
  l_or_r    : IN std_logic;
  Out_word  : OUT std_logic_vector(15 downto 0);
  Shift     : IN std_logic_vector(3 downto 0)
);
END COMPONENT;

COMPONENT word_mux4
PORT (
  A        : IN std_logic_vector(15 downto 0);
  B        : IN std_logic_vector(15 downto 0);
  C        : IN std_logic_vector(15 downto 0);
  D        : IN std_logic_vector(15 downto 0);
  Sel      : IN std_logic_vector(1 downto 0);
  Out_word : OUT std_logic_vector(15 downto 0)
);
END COMPONENT;

COMPONENT word_set
PORT (
  In_word : IN std_logic_vector(15 downto 0);
  set_op  : IN std_logic_vector(2 downto 0);
  set_out : OUT std_logic
);
END COMPONENT;

-- SIGNALS
SIGNAL set_out        : std_logic_vector(15 downto 0);
SIGNAL log_barrel_out : std_logic_vector(15 downto 0);
SIGNAL logic_out      : std_logic_vector(15 downto 0);
SIGNAL Adder_Out      : std_logic_vector(15 downto 0);

-- INSTANCES
BEGIN

set_out(15 downto 1) <= "000000000000000";

halfword_adder_1 : adder PORT MAP(
  A        => A,
  alu_op1  => alu_op(1),
  alu_op3  => alu_op(3),
  alu_op4  => alu_op(4),
  B        => B,
  Out_word => Adder_Out
);

halfword_alu_logic_1 : alu_logic PORT MAP(
  A         => A,
  B         => B,
  Func      => alu_op(1 downto 0),
  logic_out => logic_out
);

halfword_log_barrel_1 : log_barrel PORT MAP(
  ar_or_log => alu_op(0),
  In_word   => A,
  l_or_r    => alu_op(1),
  Out_word  => log_barrel_out,
  Shift     => B(3 downto 0)
);

halfword_mux4_1 : word_mux4 PORT MAP(
  A        => Adder_Out,
  B        => logic_out,
  C        => log_barrel_out,
  D        => set_out,
  Out_word => alu_out,
  Sel      => alu_op(4 downto 3)
);

halfword_set_1 : word_set PORT MAP(
  In_word => Adder_Out,
  set_op  => alu_op(2 downto 0),
  set_out => set_out(0)
);

END structural;
3. alu_logic.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** alu_logic model *****

-- external ports
ENTITY alu_logic IS PORT (
  A         : IN std_logic_vector(15 downto 0);
  B         : IN std_logic_vector(15 downto 0);
  Func      : IN std_logic_vector(1 downto 0);
  logic_out : OUT std_logic_vector(15 downto 0)
);
END alu_logic;

-- internal structure
ARCHITECTURE rtl OF alu_logic IS

BEGIN

process (A, B, func)
begin
  case func is
    when "00"   => logic_out <= A;
    when "01"   => logic_out <= (A and B);
    when "10"   => logic_out <= (A or B);
    when others => logic_out <= (A xor B);
  end case;
end process;

END rtl;


4. AO22.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

entity AO22 is port (
  A, B, C, D : IN std_logic;
  \Out\      : OUT std_logic);
end AO22;

architecture behavioral of AO22 is
begin
  \Out\ <= (A and B) or (C and D);
end behavioral;
5. core.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;

-- ***** core model *****

-- external ports
ENTITY core IS PORT (
  Addr_Int    : OUT std_logic_vector(15 downto 0);
  Clock_in    : IN std_logic;
  Input_Data  : IN std_logic_vector(15 downto 0);
  Output_Data : OUT std_logic_vector(15 downto 0);
  Instr       : IN std_logic_vector(23 downto 0);
  PC          : OUT std_logic_vector(15 downto 0);
  Prog_Rd     : OUT std_logic;
  Rd          : OUT std_logic;
  Resetn      : IN std_logic;
  Stalln      : IN std_logic;
  Wr          : OUT std_logic
);
END core;

--  internal  structure 
ARCHITECTURE  structural  OF  core  IS 

--  COMPONENTS 

COMPONENT alu
PORT (
  A       : IN std_logic_vector(15 downto 0);
  alu_op  : IN std_logic_vector(4 downto 0);
  alu_out : OUT std_logic_vector(15 downto 0);
  B       : IN std_logic_vector(15 downto 0)
);
END COMPONENT;

COMPONENT word_mux3
PORT (
  A        : IN std_logic_vector(15 downto 0);
  B        : IN std_logic_vector(15 downto 0);
  C        : IN std_logic_vector(15 downto 0);
  Out_word : OUT std_logic_vector(15 downto 0);
  Sel      : IN std_logic_vector(1 downto 0)
);
END COMPONENT;

COMPONENT word_mux4
PORT (
  A        : IN std_logic_vector(15 downto 0);
  B        : IN std_logic_vector(15 downto 0);
  C        : IN std_logic_vector(15 downto 0);
  D        : IN std_logic_vector(15 downto 0);
  Out_word : OUT std_logic_vector(15 downto 0);
  Sel      : IN std_logic_vector(1 downto 0)
);
END COMPONENT;
COMPONENT regfile
PORT (
  A            : OUT std_logic_vector(15 downto 0);
  B            : OUT std_logic_vector(15 downto 0);
  clock        : IN std_logic;
  Data_In      : IN std_logic_vector(15 downto 0);
  Dest         : IN std_logic_vector(3 downto 0);
  Stalln       : IN std_logic;
  resetn       : IN std_logic;
  RSone        : IN std_logic_vector(3 downto 0);
  RStwo        : IN std_logic_vector(3 downto 0);
  scan_data_in : IN std_logic;
  scan_enable  : IN std_logic;
  wb_enable    : IN std_logic
);
END COMPONENT;

COMPONENT word_reg_single
PORT (
  Clock        : IN std_logic;
  Data_In      : IN std_logic_vector(15 downto 0);
  Data_out     : OUT std_logic_vector(15 downto 0);
  Enable       : IN std_logic;
  Resetn       : IN std_logic;
  Scan_Data_In : IN std_logic;
  Scan_Enable  : IN std_logic
);
END COMPONENT;

COMPONENT pc_control
PORT (
  ALU_Out       : IN std_logic_vector(15 downto 0);
  Clock         : IN std_logic;
  D2_Inc_PC     : OUT std_logic_vector(15 downto 0);
  D_Link_PC     : OUT std_logic_vector(15 downto 0);
  IAR_Enable    : IN std_logic;
  PC            : OUT std_logic_vector(15 downto 0);
  PC_Sel        : IN std_logic_vector(1 downto 0);
  Resetn        : IN std_logic;
  Scan_Data_In  : IN std_logic;
  Scan_Data_Out : OUT std_logic;
  Scan_Enable   : IN std_logic;
  Stalln        : IN std_logic
);
END COMPONENT;

COMPONENT pipeline
PORT (
  alu_op       : OUT std_logic_vector(4 downto 0);
  A_Mux        : OUT std_logic_vector(1 downto 0);
  B_Mux        : OUT std_logic_vector(1 downto 0);
  Clock        : IN std_logic;
  Data_In      : IN std_logic_vector(23 downto 0);
  Dest         : OUT std_logic_vector(3 downto 0);
  Immed        : OUT std_logic_vector(15 downto 0);
  PC_Sel       : OUT std_logic_vector(1 downto 0);
  rd_enable    : OUT std_logic;
  Reg_In_Sel   : OUT std_logic_vector(1 downto 0);
  Resetn       : IN std_logic;
  RSone        : OUT std_logic_vector(3 downto 0);
  RStwo        : OUT std_logic_vector(3 downto 0);
  Scan_Data_In : IN std_logic;
  Scan_Enable  : IN std_logic;
  Stalln       : IN std_logic;
  wb_enable    : OUT std_logic;
  scan_out     : OUT std_logic;
  IAR_Enable   : OUT std_logic;
  wr_enable    : OUT std_logic;
  zero_flag    : IN std_logic
);
END COMPONENT;

COMPONENT rw_control
PORT (
  Clock     : IN std_logic;
  Prog_Rd   : OUT std_logic;
  Rd        : OUT std_logic;
  rd_enable : IN std_logic;
  resetn    : IN std_logic;
  Stalln    : IN std_logic;
  Wr        : OUT std_logic;
  wr_enable : IN std_logic
);
END COMPONENT;

COMPONENT zero_test
PORT (
  In_word   : IN std_logic_vector(15 downto 0);
  zero_flag : OUT std_logic
);
END COMPONENT;

--  SIGNALS 

SIGNAL wr_enable           : std_logic;
SIGNAL zero_flag           : std_logic;
SIGNAL IAR_Enable          : std_logic;
SIGNAL wb_enable           : std_logic;
SIGNAL pipeline_scan_out   : std_logic;
SIGNAL Dest                : std_logic_vector(3 downto 0);
SIGNAL A                   : std_logic_vector(15 downto 0);
SIGNAL D2_Inc_PC           : std_logic_vector(15 downto 0);
SIGNAL Immed               : std_logic_vector(15 downto 0);
SIGNAL D_ALU_Out           : std_logic_vector(15 downto 0);
SIGNAL D_Link_PC           : std_logic_vector(15 downto 0);
SIGNAL Reg_In_Sel          : std_logic_vector(1 downto 0);
SIGNAL ALU_A               : std_logic_vector(15 downto 0);
SIGNAL ALU_Out             : std_logic_vector(15 downto 0);
SIGNAL ALU_B               : std_logic_vector(15 downto 0);
SIGNAL Gnd                 : std_logic;
SIGNAL B                   : std_logic_vector(15 downto 0);
SIGNAL LD_Memory_In        : std_logic_vector(15 downto 0);
SIGNAL output_en_n         : std_logic;
SIGNAL rd_enable           : std_logic;
SIGNAL pc_control_scan_out : std_logic;
SIGNAL Buf_Stalln          : std_logic;
SIGNAL Buf_resetn          : std_logic;
SIGNAL Clock               : std_logic;
SIGNAL Buf_Addr_Int        : std_logic_vector(15 downto 0);
SIGNAL Shift_En            : std_logic;
SIGNAL alu_op              : std_logic_vector(4 downto 0);
SIGNAL Buf_Scan_Data_Out   : std_logic;
SIGNAL A_Mux               : std_logic_vector(1 downto 0);
SIGNAL B_Mux               : std_logic_vector(1 downto 0);
SIGNAL RSone               : std_logic_vector(3 downto 0);
SIGNAL RStwo               : std_logic_vector(3 downto 0);
SIGNAL PC_Sel              : std_logic_vector(1 downto 0);
SIGNAL Data_Out            : std_logic_vector(15 downto 0);
SIGNAL Regfile_In          : std_logic_vector(15 downto 0);
SIGNAL zero_byte           : std_logic_vector(7 downto 0);
SIGNAL Data_In             : std_logic_vector(15 downto 0);
SIGNAL sign_ext_immed      : std_logic_vector(15 downto 0);
SIGNAL scan_data_in        : std_logic;

-- INSTANCES

BEGIN 

clock <= clock_in;
shift_en <= '0';
scan_data_in <= '0';
Addr_Int <= Buf_Addr_Int;
zero_byte <= "00000000";

sign_ext_immed(15 downto 8) <= Immed(7) & Immed(7) & Immed(7) &
  Immed(7) & Immed(7) & Immed(7) & Immed(7) & Immed(7);
sign_ext_immed(7 downto 0) <= Immed(7 downto 0);

Wr <= output_en_n;
Output_Data <= Data_Out;

Word_Reg_1 : word_reg_single PORT MAP(
  Clock        => Clock,
  Data_In      => B,
  Data_out     => Data_Out,
  Enable       => Stalln,
  Resetn       => Resetn,
  Scan_Data_In => pc_control_scan_out,
  Scan_Enable  => Shift_En
);

Word_Reg_2 : word_reg_single PORT MAP(
  Clock        => Clock,
  Data_In      => Input_Data,
  Data_out     => LD_Memory_In,
  Enable       => Stalln,
  Resetn       => Resetn,
  Scan_Data_In => Data_Out(15),
  Scan_Enable  => Shift_En
);
alu_1 : alu PORT MAP(
  A       => ALU_A,
  alu_op  => alu_op,
  alu_out => ALU_Out,
  B       => ALU_B
);

word_mux3_1 : word_mux3 PORT MAP(
  A        => D_ALU_Out,
  B        => LD_Memory_In,
  C        => D_Link_PC,
  Out_word => Regfile_In,
  Sel      => Reg_In_Sel
);

word_mux3_2 : word_mux3 PORT MAP(
  A               => B,
  B(7 downto 0)   => Immed(7 downto 0),
  B(15 downto 8)  => zero_byte,
  C               => sign_ext_immed,
  Out_word        => ALU_B,
  Sel             => B_Mux
);

word_mux4_1 : word_mux4 PORT MAP(
  A               => A,
  B               => D2_Inc_PC,
  C(7 downto 0)   => zero_byte,
  C(15 downto 8)  => Immed(7 downto 0),
  D               => Immed(15 downto 0),
  Out_word        => ALU_A,
  Sel             => A_Mux
);

regfile_1 : regfile PORT MAP(
  A            => A,
  B            => B,
  clock        => Clock,
  Data_In      => regfile_in,
  Dest         => Dest,
  Stalln       => Stalln,
  resetn       => resetn,
  RSone        => RSone,
  RStwo        => RStwo,
  scan_data_in => pipeline_scan_out,
  scan_enable  => Shift_En,
  wb_enable    => wb_enable
);

word_reg_single_3 : word_reg_single PORT MAP(
  Clock        => Clock,
  Data_In      => Buf_Addr_Int,
  Data_out     => D_ALU_Out,
  Enable       => Stalln,
  Resetn       => resetn,
  Scan_Data_In => Buf_Addr_Int(15),
  Scan_Enable  => Shift_En
);

word_reg_single_4 : word_reg_single PORT MAP(
  Clock        => Clock,
  Data_In      => ALU_Out,
  Data_out     => Buf_Addr_Int,
  Enable       => Stalln,
  Resetn       => resetn,
  Scan_Data_In => B(15),
  Scan_Enable  => Shift_En
);

pc_control_1 : pc_control PORT MAP(
  ALU_Out       => ALU_Out,
  Clock         => Clock,
  D2_Inc_PC     => D2_Inc_PC,
  D_Link_PC     => D_Link_PC,
  IAR_Enable    => IAR_Enable,
  PC            => PC,
  PC_Sel        => PC_Sel,
  Resetn        => resetn,
  Scan_Data_In  => D_ALU_Out(15),
  Scan_Data_Out => pc_control_scan_out,
  Scan_Enable   => Shift_En,
  Stalln        => Stalln
);

pipeline_1 : pipeline PORT MAP(
  alu_op       => alu_op,
  A_Mux        => A_Mux,
  B_Mux        => B_Mux,
  Clock        => Clock,
  Data_In      => Instr,
  Dest         => Dest,
  Immed        => Immed,
  PC_Sel       => PC_Sel,
  rd_enable    => rd_enable,
  Reg_In_Sel   => Reg_In_Sel,
  Resetn       => resetn,
  RSone        => RSone,
  RStwo        => RStwo,
  Scan_Data_In => Scan_Data_In,
  Scan_Enable  => Shift_En,
  Stalln       => Stalln,
  wb_enable    => wb_enable,
  scan_out     => pipeline_scan_out,
  IAR_Enable   => IAR_Enable,
  wr_enable    => wr_enable,
  zero_flag    => zero_flag
);

rw_control_1 : rw_control PORT MAP(
  Clock     => Clock,
  Prog_Rd   => Prog_Rd,
  Rd        => Rd,
  rd_enable => rd_enable,
  resetn    => resetn,
  Stalln    => Stalln,
  Wr        => output_en_n,
  wr_enable => wr_enable
);

zero_test_1 : zero_test PORT MAP(
  In_word   => A,
  zero_flag => zero_flag
);

END  structural; 
6. Dest_Decoder.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** Dest_Decoder model *****

-- external ports
ENTITY Dest_Decoder IS PORT (
  Dest      : IN std_logic_vector(3 downto 0);
  Enable    : OUT std_logic_vector(15 downto 1);
  wb_enable : IN std_logic
);
END Dest_Decoder;

-- internal structure
ARCHITECTURE rtl OF Dest_Decoder IS

-- SIGNALS
SIGNAL buf_enable : std_logic_vector(15 downto 1);

-- INSTANCES
BEGIN

with dest select
  buf_enable <= "000000000000001" when "0001",
                "000000000000010" when "0010",
                "000000000000100" when "0011",
                "000000000001000" when "0100",
                "000000000010000" when "0101",
                "000000000100000" when "0110",
                "000000001000000" when "0111",
                "000000010000000" when "1000",
                "000000100000000" when "1001",
                "000001000000000" when "1010",
                "000010000000000" when "1011",
                "000100000000000" when "1100",
                "001000000000000" when "1101",
                "010000000000000" when "1110",
                "100000000000000" when others;

Enable <= buf_enable when (wb_enable = '1') else
          "000000000000000";

END rtl;
7. dlx.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;

-- ***** model *****

-- external ports
ENTITY dlx IS PORT (
  Addr_Int : OUT std_logic_vector(15 downto 0);
  Clock_in : IN std_logic;
  Data     : INOUT std_logic_vector(15 downto 0);
  Instr    : IN std_logic_vector(23 downto 0);
  PC       : OUT std_logic_vector(15 downto 0);
  Prog_Rd  : OUT std_logic;
  Rd       : OUT std_logic;
  Resetn   : IN std_logic;
  Stalln   : IN std_logic;
  Wr       : OUT std_logic
);
END dlx;

-- internal structure
ARCHITECTURE structural OF dlx IS

-- COMPONENTS
COMPONENT core
PORT (
  Addr_Int    : OUT std_logic_vector(15 downto 0);
  Clock_in    : IN std_logic;
  Input_Data  : IN std_logic_vector(15 downto 0);
  Output_Data : OUT std_logic_vector(15 downto 0);
  Instr       : IN std_logic_vector(23 downto 0);
  PC          : OUT std_logic_vector(15 downto 0);
  Prog_Rd     : OUT std_logic;
  Rd          : OUT std_logic;
  Resetn      : IN std_logic;
  Stalln      : IN std_logic;
  Wr          : OUT std_logic
);
END COMPONENT;

COMPONENT IO_Pads
PORT (
  Pads        : INOUT std_logic_vector(15 downto 0);
  In_Data     : OUT std_logic_vector(15 downto 0);
  Out_Data    : IN std_logic_vector(15 downto 0);
  Output_En_n : IN std_logic
);
END COMPONENT;
-- SIGNALS
signal Input_data  : std_logic_vector(15 downto 0);
signal Output_data : std_logic_vector(15 downto 0);
signal wr_int      : std_logic;

-- INSTANCES
BEGIN

wr <= wr_int;

Core1 : core PORT MAP(
  Addr_Int    => Addr_Int,
  Clock_in    => Clock_In,
  Input_Data  => Input_data,
  Output_Data => Output_data,
  Instr       => Instr,
  PC          => PC,
  Prog_Rd     => Prog_Rd,
  Rd          => Rd,
  Resetn      => Resetn,
  Stalln      => Stalln,
  Wr          => Wr_int
);

IO_Pads_1 : IO_Pads PORT MAP(
  Pads        => Data,
  In_Data     => Input_Data,
  Out_Data    => Output_Data,
  Output_En_n => wr_int
);


END  structural; 


8. dlx_out.vhd

-- Test bench shell
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dlx_testbench is end dlx_testbench;
architecture testbench of dlx_testbench is
-- Declaration of the component under test
component DLX
port (
  Addr_Int : OUT std_logic_vector(15 downto 0);
  Clock_in : IN std_logic;
  Data     : INOUT std_logic_vector(15 downto 0);
  Instr    : IN std_logic_vector(23 downto 0);
  PC       : OUT std_logic_vector(15 downto 0);
  Prog_Rd  : OUT std_logic;
  Rd       : OUT std_logic;
  Resetn   : IN std_logic;
  Stalln   : IN std_logic;
  Wr       : OUT std_logic
);
end component;

signal addr_int : std_logic_vector(15 downto 0);
signal instr    : std_logic_vector(23 downto 0);
signal pc       : std_logic_vector(15 downto 0);
signal data     : std_logic_vector(15 downto 0);
signal resetn   : std_logic;
signal prog_rd  : std_logic;
signal rd       : std_logic;
signal wr       : std_logic;
signal stalln   : std_logic;
signal clock_in : std_logic;

begin

process -- 10 MHz clock
begin
  clock_in <= '0';
  wait for 25 ns;
  clock_in <= '0';
  wait for 25 ns;
  clock_in <= '1';
  wait for 25 ns;
  clock_in <= '0';
  wait for 25 ns;
end process;

process
begin -- power up reset process
  wait for 1 ns;
  resetn <= '0';
  stalln <= '1';
  wait for 10 ns;
  resetn <= '1';
  wait;
end process;
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process 


begin 

wait  for 

1  ns ; 

instr  <= 

X"000000"; 

— 

■  NOP 

data  <= 

"ZZZZZZZZZZZZZZZZ 

n  • 

1  r 

wait  for 

100  ns; 

instr  <= 

X"080101"; 

— 

LHI 

RI, 

#1 

wait  for 

100  ns; 

instr  <= 

X"  08 02  02"; 

— 

LHI 

R2, 

#2 

wait  for 

100  ns; 

instr  <= 

X"080303"; 

— 

LHI 

R3, 

#3 

wait  for 

100  ns; 

instr  <= 

X"080404"; 

— 

LHI 

R4, 

#4 

wait  for 

100  ns; 

instr  <= 

X"080505"; 

— 

LHI 

LO 

#5 

wait  for 

100  ns; 

instr  <= 

X"080606"; 

— 

LHI 

R6, 

#6 

wait  for 

100  ns; 

instr  <= 

X"080707"; 

— 

LHI 

R7, 

#7 

wait  for 

100  ns; 

instr  <= 

X"080808"; 

— 

LHI 

CO 

#8 

wait  for 

100  ns; 

instr  <= 

X"080909"; 

— 

LHI 

R9, 

#9 

wait  for 

100  ns; 

instr  <= 

X"080A0A"; 

— 

LHI 

RIO, 

#10 

wait  for 

100  ns; 

instr  <= 

X"080B0B"; 

— 

LHI 

RII, 

#11 

wait  for 

100  ns; 

instr  <= 

X"080C0C"; 

— 

LHI 

R12, 

#12 
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wait  for 

100  ns; 

    instr <= X"080D0D"; -- LHI R13, #13
    wait for 100 ns;
    instr <= X"080E0E"; -- LHI R14, #14
    wait for 100 ns;
    instr <= X"080F0F"; -- LHI R15, #15
    wait for 100 ns;
    instr <= X"4111FE"; -- ADDI R1, R1, FE
    wait for 100 ns;
    instr <= X"2122FD"; -- ADDUI R2, R2, FD
    wait for 100 ns;
    instr <= X"013340"; -- ADD R3, R3, R4
    wait for 100 ns;
    instr <= X"4344FF"; -- SUBI R4, R4, FF
    wait for 100 ns;
    instr <= X"235501"; -- SUBUI R5, R5, #1
    wait for 100 ns;
    instr <= X"036670"; -- SUB R6, R6, R7
    wait for 100 ns;
    instr <= X"2977FF"; -- ANDI R7, R7, FF
    wait for 100 ns;
    instr <= X"098880"; -- AND R8, R8, R9
    wait for 100 ns;
    instr <= X"2A99FF"; -- ORI R9, R9, FF
    wait for 100 ns;
    instr <= X"0AAAB0"; -- OR R10, R10, R11
    wait for 100 ns;
    instr <= X"2BBBF0"; -- XORI R11, R11, F0
    wait for 100 ns;

    instr <= X"0BCCD0"; -- XOR R12, R12, R13
    wait for 100 ns;
    instr <= X"450100"; -- SW R0, R1
    wait for 100 ns;
    instr <= X"451200"; -- SW R1, R2
    wait for 100 ns;
    instr <= X"452300"; -- SW R2, R3
    wait for 100 ns;
    instr <= X"453400"; -- SW R3, R4
    wait for 100 ns;
    instr <= X"454500"; -- SW R4, R5
    wait for 100 ns;
    instr <= X"455600"; -- SW R5, R6
    wait for 100 ns;
    instr <= X"456700"; -- SW R6, R7
    wait for 100 ns;
    instr <= X"457800"; -- SW R7, R8
    wait for 100 ns;
    instr <= X"458900"; -- SW R8, R9
    wait for 100 ns;
    instr <= X"459A00"; -- SW R9, R10
    wait for 100 ns;
    instr <= X"45AB00"; -- SW R10, R11
    wait for 100 ns;
    instr <= X"45BC00"; -- SW R11, R12
    wait for 100 ns;
    instr <= X"45CD00"; -- SW R12, R13
    wait for 100 ns;
    instr <= X"311104"; -- SLLI R1, R1, #4
    wait for 100 ns;

    instr <= X"112240"; -- SLL R2, R2, R4
    wait for 100 ns;
    instr <= X"326304"; -- SRLI R3, R6, #4
    wait for 100 ns;
    instr <= X"126440"; -- SRL R4, R6, R4
    wait for 100 ns;
    instr <= X"336504"; -- SRAI R5, R6, #4
    wait for 100 ns;
    instr <= X"136640"; -- SRA R6, R6, R4
    wait for 100 ns;
    instr <= X"387701"; -- SEQI R7, R7, #1
    wait for 100 ns;
    instr <= X"387800"; -- SEQI R8, R7, #0
    wait for 100 ns;
    instr <= X"3D7900"; -- SNEI R9, R7, #0
    wait for 100 ns;
    instr <= X"3D7A01"; -- SNEI R10, R7, #1
    wait for 100 ns;
    instr <= X"1D1B10"; -- SNE R11, R1, R1
    wait for 100 ns;
    instr <= X"1D1C20"; -- SNE R12, R1, R2
    wait for 100 ns;
    instr <= X"3C7D00"; -- SLTI R13, R7, #0
    wait for 100 ns;
    instr <= X"3C7E01"; -- SLTI R13, R7, #0
    wait for 100 ns;
    instr <= X"450100"; -- SW R0, R1
    wait for 100 ns;

    instr <= X"451200"; -- SW R1, R2
    wait for 100 ns;
    instr <= X"452300"; -- SW R2, R3
    wait for 100 ns;
    instr <= X"453400"; -- SW R3, R4
    wait for 100 ns;
    instr <= X"454500"; -- SW R4, R5
    wait for 100 ns;
    instr <= X"455600"; -- SW R5, R6
    wait for 100 ns;
    instr <= X"456700"; -- SW R6, R7
    wait for 100 ns;
    instr <= X"457800"; -- SW R7, R8
    wait for 100 ns;
    instr <= X"458900"; -- SW R8, R9
    wait for 100 ns;
    instr <= X"459A00"; -- SW R9, R10
    wait for 100 ns;
    instr <= X"45AB00"; -- SW R10, R11
    wait for 100 ns;
    instr <= X"45BC00"; -- SW R11, R12
    wait for 100 ns;
    instr <= X"45CD00"; -- SW R12, R13
    wait for 100 ns;
    instr <= X"45DE00"; -- SW R13, R14
    wait for 100 ns;
    instr <= X"187180"; -- SEQ R1, R7, R8
    wait for 100 ns;


    instr <= X"187290"; -- SEQ R2, R7, R9
    wait for 100 ns;
    instr <= X"1C7360"; -- SLT R3, R7, R6
    wait for 100 ns;
    instr <= X"1C6470"; -- SLT R4, R6, R7
    wait for 100 ns;
    instr <= X"1A6570"; -- SGT R5, R6, R7
    wait for 100 ns;
    instr <= X"1A7660"; -- SGT R6, R7, R6
    wait for 100 ns;
    instr <= X"5A8701"; -- SGTI R8, R7, #1
    wait for 100 ns;
    instr <= X"5A8800"; -- SGTI R8, R8, 0
    wait for 100 ns;
    instr <= X"5BB9FF"; -- SLEI R9, R11, FF
    wait for 100 ns;
    instr <= X"5BBA01"; -- SLEI R10, R11, #1
    wait for 100 ns;
    instr <= X"5BBB02"; -- SLEI R11, R11, #2
    wait for 100 ns;
    instr <= X"1B2C10"; -- SLE R12, R2, R1
    wait for 100 ns;
    instr <= X"1B2D40"; -- SLE R13, R2, R4
    wait for 100 ns;
    instr <= X"1B1E20"; -- SLE R14, R1, R2
    wait for 100 ns;
    instr <= X"450100"; -- SW R0, R1
    wait for 100 ns;

    instr <= X"451200"; -- SW R1, R2
    wait for 100 ns;
    instr <= X"452300"; -- SW R2, R3
    wait for 100 ns;
    instr <= X"453400"; -- SW R3, R4
    wait for 100 ns;
    instr <= X"454500"; -- SW R4, R5
    wait for 100 ns;
    instr <= X"455600"; -- SW R5, R6
    wait for 100 ns;
    instr <= X"456700"; -- SW R6, R7
    wait for 100 ns;
    instr <= X"457800"; -- SW R7, R8
    wait for 100 ns;
    instr <= X"458900"; -- SW R8, R9
    wait for 100 ns;
    instr <= X"459A00"; -- SW R9, R10
    wait for 100 ns;
    instr <= X"45AB00"; -- SW R10, R11
    wait for 100 ns;
    instr <= X"45BC00"; -- SW R11, R12
    wait for 100 ns;
    instr <= X"45CD00"; -- SW R12, R13
    wait for 100 ns;
    instr <= X"45DE00"; -- SW R13, R14
    wait for 100 ns;
    instr <= X"191120"; -- SGE R1, R1, R2
    wait for 100 ns;
    instr <= X"192210"; -- SGE R2, R2, R1
    wait for 100 ns;

    instr <= X"192320"; -- SGE R3, R2, R2
    wait for 100 ns;
    instr <= X"595402"; -- SGEI R4, R5, #02
    wait for 100 ns;
    instr <= X"5955FF"; -- SGEI R5, R5, FF
    wait for 100 ns;
    instr <= X"596500"; -- SGEI R6, R5, #0
    wait for 100 ns;
    instr <= X"450100"; -- SW R0, R1
    wait for 100 ns;
    instr <= X"451200"; -- SW R1, R2
    wait for 100 ns;
    instr <= X"452300"; -- SW R2, R3
    wait for 100 ns;
    instr <= X"453400"; -- SW R3, R4
    wait for 100 ns;
    instr <= X"454500"; -- SW R4, R5
    wait for 100 ns;
    instr <= X"455600"; -- SW R5, R6
    wait for 100 ns;
    instr <= X"C800FF"; -- J 0x00FF
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"E88000"; -- JAL 0x8000
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"450F00"; -- SW R0, R15
    wait for 100 ns;
    instr <= X"C1200F"; -- BEQZ R2, 0x0F
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"C1000F"; -- BEQZ R0, 0x0F
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;


    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"C0000F"; -- BNEZ R0, 0x0F
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"C0200F"; -- BNEZ R2, 0x0F
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"48F000"; -- JR R15
    wait for 100 ns;


    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"68F000"; -- JALR R15
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"450F00"; -- SW R0, R15
    wait for 100 ns;
    instr <= X"28FF00"; -- TRAP FF00
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"F80000"; -- RFE
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    DATA <= X"FFF1";
    instr <= X"440100"; -- LW R0(0), R1
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    DATA <= "ZZZZZZZZZZZZZZZZ";
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"450100"; -- SW R0(0), R1
    wait for 100 ns;

    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
    instr <= X"000000"; -- NOP
    wait for 100 ns;
end process;


--  Place  stimulus  and  analysis  statements  here 


dut : DLX port map (
    Instr => instr,
    Addr_Int => addr_int,
    PC => pc,
    Data => data,
    Resetn => resetn,
    Prog_Rd => prog_rd,
    Rd => rd,
    Wr => wr,
    Stalln => stalln,
    Clock_in => clock_in
);

end testbench;

9. increment.vhd


LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;

-- ***** model *****
-- external ports
ENTITY dlx IS PORT (
    Addr_Int : OUT std_logic_vector(15 downto 0);
    Clock_in : IN std_logic;
    Data : INOUT std_logic_vector(15 downto 0);
    Instr : IN std_logic_vector(23 downto 0);
    PC : OUT std_logic_vector(15 downto 0);
    Prog_Rd : OUT std_logic;
    Rd : OUT std_logic;
    Resetn : IN std_logic;
    Stalln : IN std_logic;
    Wr : OUT std_logic
);
END dlx;

--  internal  structure 

ARCHITECTURE structural OF dlx IS

-- COMPONENTS
COMPONENT core
PORT (
    Addr_Int : OUT std_logic_vector(15 downto 0);
    Clock_in : IN std_logic;
    Input_Data : IN std_logic_vector(15 downto 0);
    Output_Data : OUT std_logic_vector(15 downto 0);
    Instr : IN std_logic_vector(23 downto 0);
    PC : OUT std_logic_vector(15 downto 0);
    Prog_Rd : OUT std_logic;
    Rd : OUT std_logic;
    Resetn : IN std_logic;
    Stalln : IN std_logic;
    Wr : OUT std_logic
);
END COMPONENT;


COMPONENT IO_Pads
PORT (
    Pads : INOUT std_logic_vector(15 downto 0);
    In_Data : OUT std_logic_vector(15 downto 0);
    Out_Data : IN std_logic_vector(15 downto 0);
    Output_En_n : IN std_logic
);
END COMPONENT;


-- SIGNALS
signal Input_data : std_logic_vector(15 downto 0);
signal Output_data : std_logic_vector(15 downto 0);
signal wr_int : std_logic;

-- INSTANCES
BEGIN

wr <= wr_int;

Core1 : core PORT MAP (
    Addr_Int => Addr_Int,
    Clock_in => Clock_In,
    Input_Data => Input_data,
    Output_Data => Output_data,
    Instr => Instr,
    PC => PC,
    Prog_Rd => Prog_Rd,
    Rd => Rd,
    Resetn => Resetn,
    Stalln => Stalln,
    Wr => Wr_int
);

IO_Pads_1 : IO_Pads PORT MAP (
    Pads => Data,
    In_Data => Input_Data,
    Out_Data => Output_Data,
    Output_En_n => wr_int
);

END structural;

10. IOPads.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- *** IO_Pads Model ***
-- external ports
Entity IO_Pads is PORT (
    Pads : INOUT std_logic_vector(15 downto 0);
    In_Data : OUT std_logic_vector(15 downto 0);
    Out_Data : IN std_logic_vector(15 downto 0);
    Output_En_n : IN std_logic
);
END IO_Pads;

Architecture Behavior of IO_Pads is
Begin
    --In_Data <= Pads;
    Pads <= Out_Data when Output_En_n = '0' else (Pads'range => 'Z');
    In_Data <= Pads;
end Behavior;

11. logbarrel.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** log_barrel model *****
-- external ports
ENTITY log_barrel IS PORT (
    ar_or_log : IN std_logic;
    In_word : IN std_logic_vector(15 downto 0);
    l_or_r : IN std_logic;
    Out_word : OUT std_logic_vector(15 downto 0);
    Shift : IN std_logic_vector(3 downto 0)
);
END log_barrel;

-- internal structure
ARCHITECTURE rtl OF log_barrel IS

signal sel1, sel2, sel3, sel4 : std_logic_vector(1 downto 0);
signal buf0b, buf0c, buf0d : std_logic_vector(15 downto 0);
signal buf1a, buf1b, buf1c, buf1d : std_logic_vector(15 downto 0);
signal buf2a, buf2b, buf2c, buf2d : std_logic_vector(15 downto 0);
signal buf3a, buf3b, buf3c, buf3d : std_logic_vector(15 downto 0);

component word_mux4
port (
    a : in std_logic_vector(15 downto 0);
    b : in std_logic_vector(15 downto 0);
    c : in std_logic_vector(15 downto 0);
    d : in std_logic_vector(15 downto 0);
    sel : in std_logic_vector(1 downto 0);
    out_word : out std_logic_vector(15 downto 0)
);
end component;


begin 


sel1(1) <= l_or_r and shift(0);
sel1(0) <= ar_or_log and shift(0);
sel2(1) <= l_or_r and shift(1);
sel2(0) <= ar_or_log and shift(1);
sel3(1) <= l_or_r and shift(2);
sel3(0) <= ar_or_log and shift(2);
sel4(1) <= l_or_r and shift(3);
sel4(0) <= ar_or_log and shift(3);

buf0b <= in_word(14 downto 0) & "0";
buf0c <= "0" & in_word(15 downto 1);
buf0d <= in_word(15) & in_word(15 downto 1);

buf1b <= buf1a(13 downto 0) & "00";
buf1c <= "00" & buf1a(15 downto 2);
buf1d <= buf1a(15) & buf1a(15) & buf1a(15 downto 2);

buf2b <= buf2a(11 downto 0) & "0000";
buf2c <= "0000" & buf2a(15 downto 4);
buf2d <= buf2a(15) & buf2a(15) & buf2a(15) & buf2a(15) &
         buf2a(15 downto 4);

buf3b <= buf3a(7 downto 0) & "00000000";
buf3c <= "00000000" & buf3a(15 downto 8);
buf3d <= buf3a(15) & buf3a(15) & buf3a(15) & buf3a(15) &
         buf3a(15) & buf3a(15) & buf3a(15) & buf3a(15) &
         buf3a(15 downto 8);

mux1 : word_mux4
port map (
    a => in_word,
    b => buf0b,
    c => buf0c,
    d => buf0d,
    sel => sel1,
    out_word => buf1a
);

mux2 : word_mux4
port map (
    a => buf1a,
    b => buf1b,
    c => buf1c,
    d => buf1d,
    sel => sel2,
    out_word => buf2a
);

mux3 : word_mux4
port map (
    a => buf2a,
    b => buf2b,
    c => buf2c,
    d => buf2d,
    sel => sel3,
    out_word => buf3a
);

mux4 : word_mux4
port map (
    a => buf3a,
    b => buf3b,
    c => buf3c,
    d => buf3d,
    sel => sel4,
    out_word => out_word
);

end rtl;

12. pccontrol.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** pc_control model *****
-- external ports
ENTITY pc_control IS PORT (
    ALU_Out : IN std_logic_vector(15 downto 0);
    Clock : IN std_logic;
    D2_Inc_PC : OUT std_logic_vector(15 downto 0);
    D_Link_PC : OUT std_logic_vector(15 downto 0);
    IAR_Enable : IN std_logic;
    In_PC : OUT std_logic_vector(15 downto 0);
    PC : OUT std_logic_vector(15 downto 0);
    PC_Sel : IN std_logic_vector(1 downto 0);
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Data_Out : OUT std_logic;
    Scan_Enable : IN std_logic;
    Stalln : IN std_logic
);
END pc_control;

--  internal  structure 

ARCHITECTURE structural OF pc_control IS

-- COMPONENTS
COMPONENT word_reg_single
PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(15 downto 0);
    Data_out : OUT std_logic_vector(15 downto 0);
    Enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END COMPONENT;

COMPONENT word_mux3
PORT (
    A : IN std_logic_vector(15 downto 0);
    B : IN std_logic_vector(15 downto 0);
    C : IN std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0);
    Sel : IN std_logic_vector(1 downto 0)
);
END COMPONENT;

COMPONENT increment
PORT (
    CI : IN std_logic;
    In_word : IN std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0)
);
END COMPONENT;

-- SIGNALS
SIGNAL IAR : std_logic_vector(15 downto 0);
SIGNAL PC_Incr : std_logic_vector(15 downto 0);
SIGNAL Buf_In_PC : std_logic_vector(15 downto 0);
SIGNAL Buf_PC : std_logic_vector(15 downto 0);
SIGNAL Buf_Scan_Data_Out : std_logic;
SIGNAL Buf_D1_Inc_PC : std_logic_vector(15 downto 0);
SIGNAL Buf_D2_Inc_PC : std_logic_vector(15 downto 0);
SIGNAL Buf_D_Link_PC : std_logic_vector(15 downto 0);
SIGNAL Link_PC : std_logic_vector(15 downto 0);
SIGNAL Buf_Link_PC : std_logic_vector(15 downto 0);

-- INSTANCES
BEGIN

In_PC <= Buf_In_PC;
PC <= Buf_PC;
D2_Inc_PC <= Buf_D2_Inc_PC;
D_Link_PC <= Buf_D_Link_PC;
Scan_Data_Out <= IAR(15);

halfword_reg_single_1 : word_reg_single PORT MAP (
    Clock => Clock,
    Data_In => Buf_In_PC,
    Data_out => Buf_PC,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Scan_Data_In,
    Scan_Enable => Scan_Enable
);

halfword_mux3_1 : word_mux3 PORT MAP (
    A => PC_Incr,
    B => ALU_Out,
    C => IAR,
    Out_word => Buf_In_PC,
    Sel => PC_Sel
);

halfword_increment_1 : increment PORT MAP (
    CI => '1',
    In_word => Buf_PC,
    Out_word => PC_Incr
);

halfword_reg_single_2 : word_reg_single PORT MAP (
    Clock => Clock,
    Data_In => PC_Incr,
    Data_out => Buf_D1_Inc_PC,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Buf_PC(15),
    Scan_Enable => Scan_Enable
);

halfword_reg_single_3 : word_reg_single PORT MAP (
    Clock => Clock,
    Data_In => Buf_D1_Inc_PC,
    Data_out => Buf_D2_Inc_PC,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Buf_D1_Inc_PC(15),
    Scan_Enable => Scan_Enable
);

halfword_increment_2 : increment PORT MAP (
    CI => '1',
    In_word(0) => '1',
    In_word(15 downto 1) => Buf_D2_Inc_PC(15 downto 1),
    Out_word(15 downto 0) => Link_PC(15 downto 0)
);

halfword_reg_single_4 : word_reg_single PORT MAP (
    Clock => Clock,
    Data_In(0) => Buf_D2_Inc_PC(0),
    Data_In(15 downto 1) => Link_PC(15 downto 1),
    Data_out => Buf_Link_PC,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Buf_D2_Inc_PC(15),
    Scan_Enable => Scan_Enable
);

halfword_reg_single_5 : word_reg_single PORT MAP (
    Clock => Clock,
    Data_In => Buf_Link_PC,
    Data_Out => Buf_D_Link_PC,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Buf_Link_PC(15),
    Scan_Enable => Scan_Enable
);

halfword_reg_single_6 : word_reg_single PORT MAP (
    Clock => Clock,
    Data_In => Buf_D_Link_PC,
    Data_out => IAR,
    Enable => IAR_Enable,
    Resetn => Resetn,
    Scan_Data_In => Buf_D_Link_PC(15),
    Scan_Enable => Scan_Enable
);

END structural;

13. pipeline.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** pipeline model *****
-- external ports
ENTITY pipeline IS PORT (
    alu_op : OUT std_logic_vector(4 downto 0);
    A_Mux : OUT std_logic_vector(1 downto 0);
    B_Mux : OUT std_logic_vector(1 downto 0);
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(23 downto 0);
    Dest : OUT std_logic_vector(3 downto 0);
    Immed : OUT std_logic_vector(15 downto 0);
    PC_Sel : OUT std_logic_vector(1 downto 0);
    rd_enable : OUT std_logic;
    Reg_In_Sel : OUT std_logic_vector(1 downto 0);
    Resetn : IN std_logic;
    RSone : OUT std_logic_vector(3 downto 0);
    RStwo : OUT std_logic_vector(3 downto 0);
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic;
    Stalln : IN std_logic;
    wb_enable : OUT std_logic;
    scan_out : OUT std_logic;
    IAR_Enable : OUT std_logic;
    wr_enable : OUT std_logic;
    zero_flag : IN std_logic
);
END pipeline;

--  internal  structure 
ARCHITECTURE rtl OF pipeline IS

-- COMPONENTS
COMPONENT twelve_bit_reg_single
PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(11 downto 0);
    Data_out : OUT std_logic_vector(11 downto 0);
    Enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END COMPONENT;

COMPONENT twenty_four_bit_reg_single
PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(23 downto 0);
    Data_out : OUT std_logic_vector(23 downto 0);
    Enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END COMPONENT;

-- SIGNALS
SIGNAL Dec_Instr : std_logic_vector(23 downto 0);
SIGNAL Ex_Instr : std_logic_vector(23 downto 0);
SIGNAL Mem_Instr : std_logic_vector(11 downto 0);
SIGNAL WB_Instr : std_logic_vector(11 downto 0);

-- INSTANCES
BEGIN

-- ****** decode pipeline stage *********
twenty_bit_reg_single_1 : twenty_four_bit_reg_single PORT MAP (
    Clock => Clock,
    Data_In => Data_In,
    Data_out => Dec_Instr,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Scan_Data_In,
    Scan_Enable => Scan_Enable
);

process (Dec_Instr)
begin
    RSone <= Dec_Instr(15 downto 12);
    -- assign RS2 (check for SW instruction)
    if (Dec_Instr(23 downto 16) = X"45") then
        RStwo <= Dec_Instr(11 downto 8);
    else
        RStwo <= Dec_Instr(7 downto 4);
    end if;
end process;

-- ****** execute pipeline stage **********
twenty_four_bit_reg_single_2 : twenty_four_bit_reg_single PORT MAP (
    Clock => Clock,
    Data_In => Dec_Instr,
    Data_out => Ex_Instr,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Dec_Instr(23),
    Scan_Enable => Scan_Enable
);


Immed <= Ex_Instr(15 downto 0); -- assign immediate value
alu_op <= Ex_Instr(20 downto 16); -- assign alu opcodes
b_mux <= Ex_Instr(22 downto 21); -- assign b mux

PC_Sel <= "01" when Ex_Instr(23 downto 16) = X"C8" else -- when OP_J
          "01" when Ex_Instr(23 downto 16) = X"E8" else -- when OP_JAL
          "0" & zero_flag when Ex_Instr(23 downto 16) = X"C1" else -- when OP_BEQZ
          "0" & not(zero_flag) when Ex_Instr(23 downto 16) = X"C0" else -- when OP_BNEZ
          "10" when Ex_Instr(23 downto 16) = X"F8" else -- OP_RFE
          "01" when Ex_Instr(23 downto 16) = X"28" else -- OP_TRAP
          "01" when Ex_Instr(23 downto 16) = X"48" else -- OP_JR
          "01" when Ex_Instr(23 downto 16) = X"68" else -- OP_JALR
          "00";

process (Ex_Instr)
begin
    case Ex_Instr(23 downto 16) is
        when X"C8" => -- when OP_J
            A_Mux <= "11";
        when X"E8" => -- when OP_JAL
            A_Mux <= "11";
        when X"C1" => -- when OP_BEQZ
            A_Mux <= "01";
        when X"C0" => -- when OP_BNEZ
            A_Mux <= "01";
        when X"08" => -- when OP_LHI
            A_Mux <= "10";
        when X"F8" => -- when OP_RFE
            A_Mux <= "00";
        when X"28" => -- when OP_TRAP
            A_Mux <= "11";
        when X"48" => -- when OP_JR
            A_Mux <= "00";
        when X"68" => -- when OP_JALR
            A_Mux <= "00";
        when others => -- OTHERS
            A_Mux <= "00";
    end case;
end process;

-- ***** memory stage of pipeline *******
twelve_bit_reg_single_1 : twelve_bit_reg_single PORT MAP (
    Clock => Clock,
    Data_In(11 downto 4) => Ex_Instr(23 downto 16),
    Data_In(3 downto 0) => Ex_Instr(11 downto 8),
    Data_out => Mem_Instr,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Ex_Instr(23),
    Scan_Enable => Scan_Enable
);

process (Mem_Instr)
begin
    case Mem_Instr(11 downto 4) is
        when X"45" => -- OP_SW (write)
            rd_enable <= '0';
            wr_enable <= '1';
        when X"44" => -- OP_LW (read)
            rd_enable <= '1';
            wr_enable <= '0';
        when others =>
            rd_enable <= '0';
            wr_enable <= '0';
    end case;
end process;

-- ******** write back stage ********
twelve_bit_reg_single_2 : twelve_bit_reg_single PORT MAP (
    Clock => Clock,
    Data_In => Mem_Instr,
    Data_out => WB_Instr,
    Enable => Stalln,
    Resetn => Resetn,
    Scan_Data_In => Mem_Instr(11),
    Scan_Enable => Scan_Enable
);

scan_out <= WB_Instr(11);

process (WB_Instr)
begin
    -- check for Jump and Link instructions to set Reg_In_Sel(1) = 1
    if (WB_Instr(11 downto 4) = X"E8" or WB_Instr(11 downto 4) = X"68") then
        Reg_In_Sel(1) <= '1';
        Dest <= "1111";
    else
        Reg_In_Sel(1) <= '0';
        Dest <= WB_Instr(3 downto 0);
    end if;

    -- check for TRAP to set IAR_Enable = 1
    if (WB_Instr(11 downto 4) = X"28") then
        IAR_Enable <= '1';
    else
        IAR_Enable <= '0';
    end if;

    -- check for LW to set Reg_In_Sel(0) = 1
    if (WB_Instr(11 downto 4) = X"44") then
        Reg_In_Sel(0) <= '1';
    else
        Reg_In_Sel(0) <= '0';
    end if;

    -- set write back enable
    case WB_Instr(11 downto 4) is
        when X"C8" => -- when OP_J
            WB_Enable <= '0';
        when X"C1" => -- when OP_BEQZ
            WB_Enable <= '0';
        when X"C0" => -- when OP_BNEZ
            WB_Enable <= '0';
        when X"45" => -- when OP_SW
            WB_Enable <= '0';
        when X"F8" => -- when OP_RFE
            WB_Enable <= '0';
        when X"28" => -- when OP_TRAP
            WB_Enable <= '0';
        when X"48" => -- when OP_JR
            WB_Enable <= '0';
        when X"00" => -- when OP_NOP
            WB_Enable <= '0';
        when others =>
            WB_Enable <= '1';
    end case;
end process;

END rtl;

14. regfile.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ******* regfile model ***********
-- external ports
ENTITY regfile IS PORT (
    A : OUT std_logic_vector(15 downto 0);
    B : OUT std_logic_vector(15 downto 0);
    clock : IN std_logic;
    Data_In : IN std_logic_vector(15 downto 0);
    Dest : IN std_logic_vector(3 downto 0);
    Stalln : IN std_logic;
    RSone : IN std_logic_vector(3 downto 0);
    RStwo : IN std_logic_vector(3 downto 0);
    scan_data_in : IN std_logic;
    scan_enable : IN std_logic;
    Resetn : IN std_logic;
    wb_enable : IN std_logic
);
END regfile;

-- internal structure
ARCHITECTURE structural OF regfile is

-- COMPONENTS
COMPONENT Dest_Decoder
PORT (
    Dest : IN std_logic_vector(3 downto 0);
    Enable : OUT std_logic_vector(15 downto 1);
    wb_enable : IN std_logic
);
END COMPONENT;

COMPONENT word_reg_single
PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(15 downto 0);
    Data_out : OUT std_logic_vector(15 downto 0);
    enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END COMPONENT;

COMPONENT word_mux16
PORT (
    In_Word0 : IN std_logic_vector(15 downto 0);
    In_Word1 : IN std_logic_vector(15 downto 0);
    In_Word2 : IN std_logic_vector(15 downto 0);
    In_Word3 : IN std_logic_vector(15 downto 0);
    In_Word4 : IN std_logic_vector(15 downto 0);
    In_Word5 : IN std_logic_vector(15 downto 0);
    In_Word6 : IN std_logic_vector(15 downto 0);
    In_Word7 : IN std_logic_vector(15 downto 0);
    In_Word8 : IN std_logic_vector(15 downto 0);
    In_Word9 : IN std_logic_vector(15 downto 0);
    In_Word10 : IN std_logic_vector(15 downto 0);
    In_Word11 : IN std_logic_vector(15 downto 0);
    In_Word12 : IN std_logic_vector(15 downto 0);
    In_Word13 : IN std_logic_vector(15 downto 0);
    In_Word14 : IN std_logic_vector(15 downto 0);
    In_Word15 : IN std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0);
    Sel : IN std_logic_vector(3 downto 0)
);
END component;

-- signals
signal Enable     : std_logic_vector(15 downto 1);
signal Reg1_Data  : std_logic_vector(15 downto 0);
signal Reg2_Data  : std_logic_vector(15 downto 0);
signal Reg3_Data  : std_logic_vector(15 downto 0);
signal Reg4_Data  : std_logic_vector(15 downto 0);
signal Reg5_Data  : std_logic_vector(15 downto 0);
signal Reg6_Data  : std_logic_vector(15 downto 0);
signal Reg7_Data  : std_logic_vector(15 downto 0);
signal Reg8_Data  : std_logic_vector(15 downto 0);
signal Reg9_Data  : std_logic_vector(15 downto 0);
signal Reg10_Data : std_logic_vector(15 downto 0);
signal Reg11_Data : std_logic_vector(15 downto 0);
signal Reg12_Data : std_logic_vector(15 downto 0);
signal Reg13_Data : std_logic_vector(15 downto 0);
signal Reg14_Data : std_logic_vector(15 downto 0);
signal Reg15_Data : std_logic_vector(15 downto 0);
signal RegA_Data  : std_logic_vector(15 downto 0);
signal MuxA_Data  : std_logic_vector(15 downto 0);
signal MuxB_Data  : std_logic_vector(15 downto 0);
signal zero_word  : std_logic_vector(15 downto 0);


begin

zero_word <= "0000000000000000";

-- port maps
Dest_Decoder1 : Dest_Decoder PORT MAP (
    Dest      => Dest,
    Enable    => Enable,
    wb_enable => wb_enable
);

word_reg1 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg1_Data,
    Enable       => Enable(1),
    Resetn       => Resetn,
    Scan_Data_In => Scan_Data_In,
    Scan_Enable  => Scan_Enable
);
word_reg2 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg2_Data,
    Enable       => Enable(2),
    Resetn       => Resetn,
    Scan_Data_In => Reg1_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg3 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg3_Data,
    Enable       => Enable(3),
    Resetn       => Resetn,
    Scan_Data_In => Reg2_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg4 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg4_Data,
    Enable       => Enable(4),
    Resetn       => Resetn,
    Scan_Data_In => Reg3_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg5 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg5_Data,
    Enable       => Enable(5),
    Resetn       => Resetn,
    Scan_Data_In => Reg4_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg6 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg6_Data,
    Enable       => Enable(6),
    Resetn       => Resetn,
    Scan_Data_In => Reg5_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg7 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg7_Data,
    Enable       => Enable(7),
    Resetn       => Resetn,
    Scan_Data_In => Reg6_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg8 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg8_Data,
    Enable       => Enable(8),
    Resetn       => Resetn,
    Scan_Data_In => Reg7_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg9 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg9_Data,
    Enable       => Enable(9),
    Resetn       => Resetn,
    Scan_Data_In => Reg8_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg10 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg10_Data,
    Enable       => Enable(10),
    Resetn       => Resetn,
    Scan_Data_In => Reg9_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg11 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg11_Data,
    Enable       => Enable(11),
    Resetn       => Resetn,
    Scan_Data_In => Reg10_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg12 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg12_Data,
    Enable       => Enable(12),
    Resetn       => Resetn,
    Scan_Data_In => Reg11_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg13 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg13_Data,
    Enable       => Enable(13),
    Resetn       => Resetn,
    Scan_Data_In => Reg12_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg14 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg14_Data,
    Enable       => Enable(14),
    Resetn       => Resetn,
    Scan_Data_In => Reg13_Data(15),
    Scan_Enable  => Scan_Enable
);

word_reg15 : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => Data_In,
    Data_out     => Reg15_Data,
    Enable       => Enable(15),
    Resetn       => Resetn,
    Scan_Data_In => Reg14_Data(15),
    Scan_Enable  => Scan_Enable
);

word_regA : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => MuxA_Data,
    Data_out     => RegA_Data,
    Enable       => stalln,
    Resetn       => Resetn,
    Scan_Data_In => Reg15_Data(15),
    Scan_Enable  => Scan_Enable
);

A <= RegA_Data;

word_regB : word_reg_single PORT MAP (
    Clock        => clock,
    Data_In      => MuxB_Data,
    Data_out     => B,
    Enable       => stalln,
    Resetn       => Resetn,
    Scan_Data_In => RegA_Data(15),
    Scan_Enable  => Scan_Enable
);

MuxA : word_mux16 PORT MAP (
    In_Word0  => zero_word,
    In_Word1  => Reg1_Data,
    In_Word2  => Reg2_Data,
    In_Word3  => Reg3_Data,
    In_Word4  => Reg4_Data,
    In_Word5  => Reg5_Data,
    In_Word6  => Reg6_Data,
    In_Word7  => Reg7_Data,
    In_Word8  => Reg8_Data,
    In_Word9  => Reg9_Data,
    In_Word10 => Reg10_Data,
    In_Word11 => Reg11_Data,
    In_Word12 => Reg12_Data,
    In_Word13 => Reg13_Data,
    In_Word14 => Reg14_Data,
    In_Word15 => Reg15_Data,
    Out_word  => MuxA_Data,
    Sel       => RSone
);

MuxB : word_mux16 PORT MAP (
    In_Word0  => zero_word,
    In_Word1  => Reg1_Data,
    In_Word2  => Reg2_Data,
    In_Word3  => Reg3_Data,
    In_Word4  => Reg4_Data,
    In_Word5  => Reg5_Data,
    In_Word6  => Reg6_Data,
    In_Word7  => Reg7_Data,
    In_Word8  => Reg8_Data,
    In_Word9  => Reg9_Data,
    In_Word10 => Reg10_Data,
    In_Word11 => Reg11_Data,
    In_Word12 => Reg12_Data,
    In_Word13 => Reg13_Data,
    In_Word14 => Reg14_Data,
    In_Word15 => Reg15_Data,
    Out_word  => MuxB_Data,
    Sel       => RStwo
);

END structural;


15. rwcontrol.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** rw_control model *****

-- external ports
ENTITY rw_control IS PORT (
    Clock     : IN  std_logic;
    Prog_Rd   : OUT std_logic;
    Rd        : OUT std_logic;
    rd_enable : IN  std_logic;
    resetn    : IN  std_logic;
    Stalln    : IN  std_logic;
    Wr        : OUT std_logic;
    wr_enable : IN  std_logic
);
END rw_control;

-- internal structure
ARCHITECTURE rtl OF rw_control IS

-- SIGNALS
SIGNAL clockn : std_logic; -- inverted clock

BEGIN

clockn  <= not (Clock);
Wr      <= not (clockn and wr_enable);
Rd      <= not (clockn and rd_enable);
Prog_Rd <= not (clockn and resetn and stalln);

end rtl;
16. scanreg.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** scan_reg model *****

-- external ports
ENTITY scan_reg IS PORT (
    clk          : IN  std_logic;
    data_in      : IN  std_logic;
    data_out     : OUT std_logic;
    enable       : IN  std_logic;
    resetn       : IN  std_logic;
    scan_data_in : IN  std_logic;
    scan_enable  : IN  std_logic
);
END scan_reg;

-- internal structure
ARCHITECTURE rtl OF scan_reg IS

-- INSTANCES
BEGIN

process (clk, resetn)
begin
    if (resetn = '0') then
        data_out <= '0';
    elsif (clk = '1' and clk'event) then
        if (scan_enable = '1') then
            data_out <= scan_data_in;
        elsif (enable = '1') then
            data_out <= data_in;
        end if;
    end if;
end process;

END rtl;

17. twelve_bit_reg_single.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** twelve_bit_reg_single model *****

-- external ports
ENTITY twelve_bit_reg_single IS PORT (
    Clock        : IN  std_logic;
    Data_In      : IN  std_logic_vector(11 downto 0);
    Data_out     : OUT std_logic_vector(11 downto 0);
    Enable       : IN  std_logic;
    Resetn       : IN  std_logic;
    Scan_Data_In : IN  std_logic;
    Scan_Enable  : IN  std_logic
);
END twelve_bit_reg_single;

-- internal structure
ARCHITECTURE structural OF twelve_bit_reg_single IS

-- COMPONENTS
COMPONENT scan_reg
PORT (
    clk          : IN  std_logic;
    data_in      : IN  std_logic;
    data_out     : OUT std_logic;
    enable       : IN  std_logic;
    resetn       : IN  std_logic;
    scan_data_in : IN  std_logic;
    scan_enable  : IN  std_logic
);
END COMPONENT;

-- SIGNALS
signal buf_data_out : std_logic_vector(10 downto 0);


--  INSTANCES 
BEGIN 


Data_out(0)  <= buf_data_out(0);
Data_out(1)  <= buf_data_out(1);
Data_out(2)  <= buf_data_out(2);
Data_out(3)  <= buf_data_out(3);
Data_out(4)  <= buf_data_out(4);
Data_out(5)  <= buf_data_out(5);
Data_out(6)  <= buf_data_out(6);
Data_out(7)  <= buf_data_out(7);
Data_out(8)  <= buf_data_out(8);
Data_out(9)  <= buf_data_out(9);
Data_out(10) <= buf_data_out(10);

scan_reg_1 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(1),
    data_out     => buf_data_out(1),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(0),
    scan_enable  => Scan_Enable
);

scan_reg_2 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(2),
    data_out     => buf_data_out(2),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(1),
    scan_enable  => Scan_Enable
);

scan_reg_3 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(3),
    data_out     => buf_data_out(3),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(2),
    scan_enable  => Scan_Enable
);

scan_reg_4 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(4),
    data_out     => buf_data_out(4),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(3),
    scan_enable  => Scan_Enable
);

scan_reg_5 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(0),
    data_out     => buf_data_out(0),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Scan_Data_In,
    scan_enable  => Scan_Enable
);

scan_reg_6 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(5),
    data_out     => buf_data_out(5),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(4),
    scan_enable  => Scan_Enable
);

scan_reg_7 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(6),
    data_out     => buf_data_out(6),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(5),
    scan_enable  => Scan_Enable
);

scan_reg_8 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(7),
    data_out     => buf_data_out(7),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(6),
    scan_enable  => Scan_Enable
);

scan_reg_9 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(8),
    data_out     => buf_data_out(8),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(7),
    scan_enable  => Scan_Enable
);

scan_reg_10 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(9),
    data_out     => buf_data_out(9),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(8),
    scan_enable  => Scan_Enable
);

scan_reg_11 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(10),
    data_out     => buf_data_out(10),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(9),
    scan_enable  => Scan_Enable
);

scan_reg_12 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(11),
    data_out     => Data_out(11),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => buf_data_out(10),
    scan_enable  => Scan_Enable
);

END  structural; 


18. twentyfourbitregsingle.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** twenty_four_bit_reg_single model *****

-- external ports
ENTITY twenty_four_bit_reg_single IS PORT (
    Clock        : IN  std_logic;
    Data_In      : IN  std_logic_vector(23 downto 0);
    Data_out     : OUT std_logic_vector(23 downto 0);
    Enable       : IN  std_logic;
    Resetn       : IN  std_logic;
    Scan_Data_In : IN  std_logic;
    Scan_Enable  : IN  std_logic
);
END twenty_four_bit_reg_single;

-- internal structure
ARCHITECTURE structural OF twenty_four_bit_reg_single IS

-- COMPONENTS
Component twelve_bit_reg_single
PORT (
    Clock        : IN  std_logic;
    Data_In      : IN  std_logic_vector(11 downto 0);
    Data_out     : OUT std_logic_vector(11 downto 0);
    Enable       : IN  std_logic;
    Resetn       : IN  std_logic;
    Scan_Data_In : IN  std_logic;
    Scan_Enable  : IN  std_logic
);
END Component;

-- SIGNALS
SIGNAL Buf_Data_out11 : std_logic;


--  INSTANCES 
BEGIN 

Data_out(11) <= Buf_Data_out11;

twelve_bit_reg_single1 : twelve_bit_reg_single PORT MAP (
    Clock                 => Clock,
    Data_In               => Data_In(11 downto 0),
    Data_Out(10 downto 0) => Data_Out(10 downto 0),
    Data_Out(11)          => Buf_Data_out11,
    Enable                => Enable,
    Resetn                => Resetn,
    Scan_Data_In          => Scan_Data_In,
    Scan_Enable           => Scan_Enable
);

twelve_bit_reg_singleC : twelve_bit_reg_single PORT MAP (
    Clock        => Clock,
    Data_In      => Data_In(23 downto 12),
    Data_Out     => Data_Out(23 downto 12),
    Enable       => Enable,
    Resetn       => Resetn,
    Scan_Data_In => Buf_Data_out11,
    Scan_Enable  => Scan_Enable
);

END  structural; 

19. word_mux16.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** word_mux16 model *****

-- external ports
ENTITY word_mux16 IS PORT (
    In_Word0  : IN  std_logic_vector(15 downto 0);
    In_Word1  : IN  std_logic_vector(15 downto 0);
    In_Word2  : IN  std_logic_vector(15 downto 0);
    In_Word3  : IN  std_logic_vector(15 downto 0);
    In_Word4  : IN  std_logic_vector(15 downto 0);
    In_Word5  : IN  std_logic_vector(15 downto 0);
    In_Word6  : IN  std_logic_vector(15 downto 0);
    In_Word7  : IN  std_logic_vector(15 downto 0);
    In_Word8  : IN  std_logic_vector(15 downto 0);
    In_Word9  : IN  std_logic_vector(15 downto 0);
    In_Word10 : IN  std_logic_vector(15 downto 0);
    In_Word11 : IN  std_logic_vector(15 downto 0);
    In_Word12 : IN  std_logic_vector(15 downto 0);
    In_Word13 : IN  std_logic_vector(15 downto 0);
    In_Word14 : IN  std_logic_vector(15 downto 0);
    In_Word15 : IN  std_logic_vector(15 downto 0);
    Out_word  : OUT std_logic_vector(15 downto 0);
    Sel       : IN  std_logic_vector(3 downto 0)
);
END word_mux16;

-- internal structure
ARCHITECTURE rtl OF word_mux16 IS

BEGIN

with sel select
    Out_word <= In_Word0  when "0000",
                In_Word1  when "0001",
                In_Word2  when "0010",
                In_Word3  when "0011",
                In_Word4  when "0100",
                In_Word5  when "0101",
                In_Word6  when "0110",
                In_Word7  when "0111",
                In_Word8  when "1000",
                In_Word9  when "1001",
                In_Word10 when "1010",
                In_Word11 when "1011",
                In_Word12 when "1100",
                In_Word13 when "1101",
                In_Word14 when "1110",
                In_Word15 when others;

END rtl;


20. word_mux3.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** word_mux3 model *****

-- external ports
ENTITY word_mux3 IS PORT (
    A        : IN  std_logic_vector(15 downto 0);
    B        : IN  std_logic_vector(15 downto 0);
    C        : IN  std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0);
    Sel      : IN  std_logic_vector(1 downto 0)
);
END word_mux3;

-- internal structure
ARCHITECTURE rtl OF word_mux3 IS
BEGIN

process (A, B, C, Sel)
begin
    case sel is
        when "00"   => Out_word <= A;
        when "01"   => Out_word <= B;
        when others => Out_word <= C;
    end case;
end process;

END rtl;


21. word_mux4.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** word_mux4 model *****

-- external ports
ENTITY word_mux4 IS PORT (
    A        : IN  std_logic_vector(15 downto 0);
    B        : IN  std_logic_vector(15 downto 0);
    C        : IN  std_logic_vector(15 downto 0);
    D        : IN  std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0);
    Sel      : IN  std_logic_vector(1 downto 0)
);
END word_mux4;

-- internal structure
ARCHITECTURE rtl OF word_mux4 IS
BEGIN

process (A, B, C, D, Sel)
begin
    case sel is
        when "00"   => Out_word <= A;
        when "01"   => Out_word <= B;
        when "10"   => Out_word <= C;
        when others => Out_word <= D;
    end case;
end process;

END rtl;
22. wordregsingle.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** reg_single model *****

-- external ports
ENTITY word_reg_single IS PORT (
    Clock        : IN  std_logic;
    Data_In      : IN  std_logic_vector(15 downto 0);
    Data_out     : OUT std_logic_vector(15 downto 0);
    Enable       : IN  std_logic;
    Resetn       : IN  std_logic;
    Scan_Data_In : IN  std_logic;
    Scan_Enable  : IN  std_logic
);
END word_reg_single;

-- internal structure
ARCHITECTURE structural OF word_reg_single IS

-- COMPONENTS
COMPONENT scan_reg
PORT (
    clk          : IN  std_logic;
    data_in      : IN  std_logic;
    data_out     : OUT std_logic;
    enable       : IN  std_logic;
    resetn       : IN  std_logic;
    scan_data_in : IN  std_logic;
    scan_enable  : IN  std_logic
);
END COMPONENT;

-- SIGNALS
SIGNAL Buf_Data_out : std_logic_vector(14 downto 0);


--  INSTANCES 
BEGIN 


Data_out(0)  <= Buf_Data_out(0);
Data_out(1)  <= Buf_Data_out(1);
Data_out(2)  <= Buf_Data_out(2);
Data_out(3)  <= Buf_Data_out(3);
Data_out(4)  <= Buf_Data_out(4);
Data_out(5)  <= Buf_Data_out(5);
Data_out(6)  <= Buf_Data_out(6);
Data_out(7)  <= Buf_Data_out(7);
Data_out(8)  <= Buf_Data_out(8);
Data_out(9)  <= Buf_Data_out(9);
Data_out(10) <= Buf_Data_out(10);
Data_out(11) <= Buf_Data_out(11);
Data_out(12) <= Buf_Data_out(12);
Data_out(13) <= Buf_Data_out(13);
Data_out(14) <= Buf_Data_out(14);

scan_reg_1 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(1),
    data_out     => Buf_Data_out(1),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(0),
    scan_enable  => Scan_Enable
);

scan_reg_2 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(2),
    data_out     => Buf_Data_out(2),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(1),
    scan_enable  => Scan_Enable
);

scan_reg_3 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(3),
    data_out     => Buf_Data_out(3),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(2),
    scan_enable  => Scan_Enable
);

scan_reg_4 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(4),
    data_out     => Buf_Data_out(4),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(3),
    scan_enable  => Scan_Enable
);

scan_reg_6 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(5),
    data_out     => Buf_Data_out(5),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(4),
    scan_enable  => Scan_Enable
);

scan_reg_7 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(6),
    data_out     => Buf_Data_out(6),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(5),
    scan_enable  => Scan_Enable
);

scan_reg_8 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(7),
    data_out     => Buf_Data_out(7),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(6),
    scan_enable  => Scan_Enable
);

scan_reg_9 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(8),
    data_out     => Buf_Data_out(8),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(7),
    scan_enable  => Scan_Enable
);

scan_reg_10 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(9),
    data_out     => Buf_Data_out(9),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(8),
    scan_enable  => Scan_Enable
);

scan_reg_11 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(10),
    data_out     => Buf_Data_out(10),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(9),
    scan_enable  => Scan_Enable
);

scan_reg_12 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(11),
    data_out     => Buf_Data_out(11),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(10),
    scan_enable  => Scan_Enable
);

scan_reg_13 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(12),
    data_out     => Buf_Data_out(12),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(11),
    scan_enable  => Scan_Enable
);

scan_reg_14 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(13),
    data_out     => Buf_Data_out(13),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(12),
    scan_enable  => Scan_Enable
);

scan_reg_15 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(14),
    data_out     => Buf_Data_out(14),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(13),
    scan_enable  => Scan_Enable
);

scan_reg_16 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(15),
    data_out     => Data_out(15),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Buf_Data_out(14),
    scan_enable  => Scan_Enable
);

scan_reg_5 : scan_reg PORT MAP (
    clk          => Clock,
    data_in      => Data_In(0),
    data_out     => Buf_Data_out(0),
    enable       => Enable,
    resetn       => Resetn,
    scan_data_in => Scan_Data_In,
    scan_enable  => Scan_Enable
);

END  structural; 

23. wordset.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** word_set model *****

-- external ports
ENTITY word_set IS PORT (
    In_word : IN  std_logic_vector(15 downto 0);
    set_op  : IN  std_logic_vector(2 downto 0);
    set_out : OUT std_logic
);
END word_set;

-- internal structure
ARCHITECTURE rtl OF word_set IS

component zero_test
PORT (
    In_word   : in  std_logic_vector(15 downto 0);
    zero_flag : OUT std_logic
);
END component;

signal zero_flag : std_logic;


begin

process (In_word, set_op, zero_flag)
begin
    case set_op is
        when "000"  => set_out <= zero_flag;
        when "001"  => set_out <= (not(In_word(15)) or zero_flag);
        when "010"  => set_out <= not(In_word(15)) and not(zero_flag);
        when "011"  => set_out <= (In_word(15) or zero_flag);
        when "100"  => set_out <= In_word(15);
        when others => set_out <= not(zero_flag);
    end case;
end process;

zero_test1 : zero_test port map (
    In_word   => In_word,
    zero_flag => zero_flag
);

END rtl;

24. zerotest.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** zero_test model *****

-- external ports
ENTITY zero_test IS PORT (
    In_word   : in  std_logic_vector(15 downto 0);
    zero_flag : OUT std_logic
);
END zero_test;

-- internal structure
ARCHITECTURE rtl OF zero_test IS
begin

process (In_word)
begin
    if (In_word = "0000000000000000") then
        zero_flag <= '1';
    else zero_flag <= '0';
    end if;
end process;

END rtl;
APPENDIX E: GLOSSARY

BGA        Ball Grid Array
CFTP       Configurable Fault-Tolerant Processor
COTS       Commercial Off the Shelf
Coregen    CORE generator
CPLD       Complex Programmable Logic Device
ESSD       Error Syndrome Storage Device
FPGA       Field Programmable Gate Array
HDL        Hardware Description Language
IAR        Interrupt Address Register
ISR        Interrupt Service Routine
LEO        Low-Earth Orbit
Mem        Memory
NPS        Naval Postgraduate School
Opcode     Operation code
RADHARD    Radiation Hardened
RAM        Random-Access Memory
RFE        Return From Exception
RISC       Reduced Instruction Set Computer
ROM        Read-Only Memory
SEB        Single Event Burnout
SEE        Single Event Effects
SEL        Single Event Latchup
SEP        Single Event Phenomenon
SERB       Space Experiment Review Board
SEU        Single Event Upset
SOC        System On a Chip
SPLD       Sequential (or Simple) Programmable Logic Device
STP        Space Test Program
TMR        Triple Modular Redundancy
VHSIC      Very High Speed Integrated Circuit
VHDL       VHSIC Hardware Description Language
WB         Write Back